Deep reinforcement learning (DRL) is a form of reactive machine learning that combines reinforcement learning and deep learning to learn useful representations for complex problems with high-dimensional raw input data. It addresses the perception and decision-making problems of complex systems through trial and error.
Deep learning enables reinforcement learning to handle real-world problems that were otherwise intractable. Deep learning models automatically extract complex representations from high-dimensional input data and often outperform conventional machine learning methods on such data.
Deep reinforcement learning learns from the outcomes of its actions which actions perform well, and it can adapt to dynamic real-world environments. Its methods fall into two broad categories: value-based methods, such as Deep Q-learning, and policy gradient-based methods.
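To make the two categories concrete, here is a minimal sketch of one update step from each family, written in plain Python without a neural network. It assumes a tabular Q-function and a softmax policy over per-action preferences; the function names, learning rates, and discount factor are illustrative choices, not taken from any particular library. In a deep variant, the table or preference list would be replaced by a network's output.

```python
import math


def q_learning_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.99):
    """Value-based step: move Q(s, a) toward r + gamma * max_a' Q(s', a')."""
    target = reward + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])
    return q[state][action]


def softmax(prefs):
    """Turn action preferences into probabilities (shifted for numerical stability)."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def policy_gradient_update(prefs, action, ret, lr=0.1):
    """Policy-gradient (REINFORCE-style) step for a softmax policy.

    The gradient of log pi(action) with respect to prefs[i] is
    (1 if i == action else 0) - pi[i]; it is scaled by the observed return.
    """
    probs = softmax(prefs)
    for i in range(len(prefs)):
        indicator = 1.0 if i == action else 0.0
        prefs[i] += lr * ret * (indicator - probs[i])
    return prefs


# Hypothetical two-state, two-action example.
q_table = [[0.0, 0.0], [0.0, 1.0]]
new_q = q_learning_update(q_table, state=0, action=1, reward=1.0,
                          next_state=1, alpha=0.5, gamma=0.9)

preferences = policy_gradient_update([0.0, 0.0], action=0, ret=1.0, lr=0.1)
```

The value-based step learns how good each action is and acts greedily on those estimates, whereas the policy-gradient step adjusts the action probabilities directly in proportion to the return that followed.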
Application areas of deep reinforcement learning include games, computer vision, healthcare, robotics, smart grids, finance, self-driving cars, and natural language processing. Current research directions include hierarchical reinforcement learning, imitation learning, inverse reinforcement learning, multi-agent reinforcement learning, and transfer learning, among others.
• Deep reinforcement learning (DRL) combines reinforcement learning (RL) and deep learning to solve complex decision-making problems that were previously intractable, that is, problems with high-dimensional state and action spaces.
• Because it can learn different levels of abstraction from data, DRL addresses the curse of dimensionality faced by traditional RL methods and makes it possible to perform complicated tasks with less prior knowledge.
• DRL pursues the vision of creating systems capable of learning how to adapt to the real world.
• DRL opens up many new applications in domains such as healthcare, robotics, smart grids, and finance.
• A standout success is AlphaGo, a hybrid DRL system trained using a combination of supervised learning and reinforcement learning, together with a traditional heuristic search algorithm.
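
The curse of dimensionality mentioned in the list above can be illustrated with a rough count. The sketch below compares the number of entries in a tabular Q-function with the number of parameters in a small neural network as the state dimension grows. The discretisation (10 values per dimension), the 4 actions, and the single hidden layer of 64 units are all assumptions chosen for illustration; a small parameter count does not by itself guarantee that a network learns a good value function.

```python
def tabular_entries(values_per_dim, num_dims, num_actions):
    """Q-table size when every state dimension is discretised into values_per_dim bins."""
    return (values_per_dim ** num_dims) * num_actions


def mlp_parameters(num_dims, hidden_units, num_actions):
    """Weights plus biases of a one-hidden-layer network mapping a state to Q-values."""
    hidden_layer = num_dims * hidden_units + hidden_units
    output_layer = hidden_units * num_actions + num_actions
    return hidden_layer + output_layer


# Table size grows exponentially with the state dimension,
# while the network's parameter count grows only linearly.
for dims in (2, 6, 10):
    table = tabular_entries(values_per_dim=10, num_dims=dims, num_actions=4)
    network = mlp_parameters(num_dims=dims, hidden_units=64, num_actions=4)
    print(f"{dims:>2} dims: table entries = {table:,}, network parameters = {network:,}")
```

With these assumed settings, a 10-dimensional state already needs 40 billion table entries, while the network has fewer than a thousand parameters.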