Research Area:  Machine Learning
In distributional reinforcement learning (RL), the estimated distribution of the value function models both the parametric and intrinsic uncertainties. We propose a novel and efficient exploration method for deep RL that has two components. The first is a decaying schedule to suppress the intrinsic uncertainty. The second is an exploration bonus calculated from the upper quantiles of the learned distribution. In Atari 2600 games, our method achieves a 483% average gain in cumulative rewards over QR-DQN across 49 games. We also compared our algorithm with QR-DQN in a challenging 3D driving simulator (CARLA). Results show that our algorithm achieves near-optimal safety rewards twice as fast as QR-DQN.
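A minimal sketch of the idea described in the abstract, assuming a QR-DQN-style agent that maintains per-action quantile estimates of the return distribution. The particular bonus form (spread of the quantiles above the median), the schedule c_t = c * sqrt(log t / t), and the constant c are illustrative assumptions, not taken from the paper:

# Illustrative sketch (not the authors' exact implementation): action selection with a
# decaying exploration bonus computed from the upper quantiles of a learned distribution.
import numpy as np

def upper_quantile_bonus(quantiles: np.ndarray) -> np.ndarray:
    """quantiles: sorted per-action quantile estimates, shape [num_actions, num_quantiles].
    Returns a per-action bonus based on the spread of the upper half of the quantiles."""
    mid = quantiles.shape[1] // 2
    median = quantiles[:, mid:mid + 1]                 # [num_actions, 1]
    upper = quantiles[:, mid:]                         # quantiles above (and at) the median
    return np.mean((upper - median) ** 2, axis=1)      # per-action upper spread

def select_action(quantiles: np.ndarray, t: int, c: float = 50.0) -> int:
    """Pick the action maximizing mean value plus a decaying optimism bonus."""
    mean_q = quantiles.mean(axis=1)                    # expected return per action
    c_t = c * np.sqrt(np.log(max(t, 2)) / max(t, 2))   # assumed decaying schedule
    return int(np.argmax(mean_q + c_t * np.sqrt(upper_quantile_bonus(quantiles))))

# Usage: 4 actions, 51 quantile estimates per action
rng = np.random.default_rng(0)
theta = np.sort(rng.normal(size=(4, 51)), axis=1)
print(select_action(theta, t=1000))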
Keywords:  
Author(s) Name:  Borislav Mavrin, Hengshuai Yao, Linglong Kong, Kaiwen Wu, Yaoliang Yu
Journal name:  
Conference name:  Proceedings of the 36th International Conference on Machine Learning
Publisher name:  arXiv
DOI:  10.48550/arXiv.1905.06125
Volume Information:  
Paper Link:   https://arxiv.org/abs/1905.06125