Incremental Reinforcement Learning With Prioritized Sweeping

Research Area: Machine Learning

Abstract:

This paper presents a novel incremental learning algorithm for reinforcement learning (RL) in dynamic environments, where the rewards of state-action pairs may change over time. The proposed incremental RL (IRL) algorithm learns in a dynamic environment without making assumptions about, or requiring prior knowledge of, how the environment changes. First, IRL generates a detector-agent that detects the changed part of the environment (the drift environment) by executing a virtual RL process. Then, the agent gives priority to the drift environment and its neighbor environment, iteratively updating their state-action value functions with the new rewards by dynamic programming. After this prioritized sweeping process, IRL restarts a canonical learning process to obtain a new optimal policy adapted to the new environment. The novelty is that IRL fuses the new information into the existing knowledge system incrementally while weakening the conflict between them. The IRL algorithm is compared with two direct approaches and several state-of-the-art transfer learning methods on classical maze navigation problems and on an intelligent warehouse with multiple robots. The experimental results verify that IRL can effectively improve the adaptability and efficiency of RL algorithms in dynamic environments.
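
The detect-then-sweep idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the five-state chain world, the function names (step, detect_drift, neighbors_within, sweep), the discount factor, and the neighborhood radius are all assumptions made for illustration. In particular, drift is found here by directly comparing two reward tables, which stands in for the virtual RL process run by the paper's detector-agent, and the final restart of a canonical learning process is omitted.

```python
GAMMA = 0.9  # assumed discount factor, not taken from the paper


def step(state, action, n_states):
    """Hypothetical deterministic chain: action 0 moves left, action 1 moves right."""
    if action == 0:
        return max(state - 1, 0)
    return min(state + 1, n_states - 1)


def detect_drift(old_rewards, new_rewards):
    """Return the states whose reward changed for at least one action.

    Simplification: compares reward tables directly instead of running
    a detector-agent through a virtual RL process.
    """
    return {s for s in range(len(old_rewards)) if old_rewards[s] != new_rewards[s]}


def neighbors_within(drift_states, n_states, radius):
    """Expand the drift set to every state that can reach it in at most `radius` steps."""
    region = set(drift_states)
    for _ in range(radius):
        region |= {
            s
            for s in range(n_states)
            for a in (0, 1)
            if step(s, a, n_states) in region
        }
    return region


def sweep(q, rewards, region, n_sweeps):
    """Apply Bellman backups only to states in the prioritized region (in place)."""
    n_states = len(q)
    for _ in range(n_sweeps):
        for s in sorted(region):
            for a in (0, 1):
                s_next = step(s, a, n_states)
                q[s][a] = rewards[s][a] + GAMMA * max(q[s_next])
    return q
```

A caller would pass the Q-table learned in the old environment as q, so that values outside the swept region are reused unchanged and only the drift region and its neighbors are recomputed before ordinary learning resumes.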

Keywords:

Author(s) Name: Zhi Wang; Chunlin Chen; Han-Xiong Li; Daoyi Dong; Tzyh-Jong Tarn

Journal name: IEEE/ASME Transactions on Mechatronics

Conference name:

Publisher name: IEEE

DOI: 10.1109/TMECH.2019.2899365

Volume Information: (Volume: 24, Issue: 2, April 2019) Page(s): 621-632

Paper Link: https://ieeexplore.ieee.org/abstract/document/8642342
