Research Area:  Machine Learning
Reinforcement learning (RL) algorithms find applications in inventory control, recommender systems, vehicular traffic management, cloud computing, and robotics. The real-world complications arising in these domains makes them difficult to solve with the basic assumptions underlying classical RL algorithms. RL agents in these applications often need to react and adapt to changing operating conditions. A significant part of research on single-agent RL techniques focuses on developing algorithms when the underlying assumption of stationary environment model is relaxed. This article provides a survey of RL methods developed for handling dynamically varying environment models. The goal of methods not limited by the stationarity assumption is to help autonomous agents adapt to varying operating conditions. This is possible either by minimizing the rewards lost during learning by RL agent or by finding a suitable policy for the RL agent that leads to efficient operation of the underlying system. A representative collection of these algorithms is discussed in detail in this work along with their categorization and their relative merits and demerits. Additionally, we also review works that are tailored to application domains. Finally, we discuss future enhancements for this field.
Author(s) Name:  Sindhu Padakandla
Journal name:  ACM Computing Surveys
Publisher name:  ACM
Volume Information:  Volume 54,Issue 6, July 2022 Article No.: 127, pp 1–25
Paper Link:   https://dl.acm.org/doi/10.1145/3459991