Deep reinforcement learning (DRL) is a research area that integrates reinforcement learning (RL) with deep learning (DL). DRL is an effective learning model because it can handle large action spaces and continuous real-world applications. Conventional RL methods struggle to find an optimal policy function and are restricted to limited action spaces, whereas DRL can handle high-dimensional environments and provides value-function approximation based on deep neural networks. The Internet of Things (IoT) extends internet connectivity to billions of IoT devices that collect, access, and share information.
Some of the essential issues in IoT are the poor scalability and elasticity exhibited in communication, computing, caching, and control (the 4Cs). DRL methods have the potential to address these problems. The advantages of DRL in IoT include high scalability, long-term performance optimization, real-time decision making, and online learning without prior knowledge. DRL-based IoT application areas include the smart grid, intelligent transportation systems (ITS), industrial IoT (IIoT), mobile crowdsensing (MCS), and blockchain-empowered IoT. Recent research and future directions for DRL-based IoT systems include the integration of federated learning with DRL for IoT, meta-learning DRL for IoT system training, model-based DRL in IoT systems, distributed RL in IoT, and DRL for IoT networks in noisy and complex environments.