Distributional reinforcement learning refers to learning to predict the entire probability distribution over the returns an agent can obtain from its environment, rather than a single expected value. It is used to help address challenges of deep reinforcement learning such as sparse rewards, high complexity, and scalability. Distributional reinforcement learning treats the return as a random variable instead of reducing it to its expectation.
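The contrast between a random return and its expectation can be written out as a short sketch. The notation is an assumption borrowed from common usage in the literature and does not appear in this text: Z is the random return from taking action a in state s under policy π, R_t is the reward at step t, γ is a discount factor between 0 and 1, and Q is the usual expected action value.

```latex
% Random return: discounted sum of future rewards (a random variable)
Z^{\pi}(s,a) = \sum_{t=0}^{\infty} \gamma^{t} R_{t}

% Classical value-based RL keeps only the mean of that random variable
Q^{\pi}(s,a) = \mathbb{E}\left[ Z^{\pi}(s,a) \right]

% Distributional Bellman equation: equality holds in distribution,
% with S' the next state and A' the next action drawn from \pi
Z^{\pi}(s,a) \overset{D}{=} R(s,a) + \gamma \, Z^{\pi}(S', A')
```

Two different return distributions can share the same Q value, which is why learning Z can carry more information than learning Q alone.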
The key goal of distributional reinforcement learning is to build algorithms that predict future reward as a return, which is the sum of discounted future rewards. Return distributions in distributional RL can be complex and multimodal, and they model all possible outcomes. The distributions in RL are represented as categorical, inverse categorical, or parametric inverse categorical. Because distributional reinforcement learning models the full distribution over returns instead of only estimating the mean, the agent has more information to draw on when acting.
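As a minimal sketch of these ideas, the Python below computes discounted returns for a toy task and summarizes them as a categorical distribution over a fixed set of atoms. Everything in it is an assumption made for illustration: the task (`sample_episode`), the discount factor, the five-atom support, and the nearest-atom assignment, which is a simplification and not the projection used by any particular published algorithm.

```python
import random
from collections import Counter


def discounted_return(rewards, gamma):
    """Sum of future rewards, each discounted by gamma per time step."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g


def sample_episode(rng):
    """Hypothetical toy task: the final step pays 10 or 0 with equal chance."""
    final_reward = 10.0 if rng.random() < 0.5 else 0.0
    return [0.0, 0.0, final_reward]


def categorical_estimate(returns, support):
    """Assign each sampled return to its nearest atom, then normalize counts.

    Nearest-atom assignment is an illustrative simplification.
    """
    counts = Counter(min(support, key=lambda z: abs(z - g)) for g in returns)
    n = len(returns)
    return [counts[z] / n for z in support]


rng = random.Random(0)          # fixed seed so the run is repeatable
gamma = 0.9                     # assumed discount factor
returns = [discounted_return(sample_episode(rng), gamma) for _ in range(10_000)]

support = [0.0, 2.5, 5.0, 7.5, 10.0]   # assumed fixed atoms
probs = categorical_estimate(returns, support)
mean_return = sum(returns) / len(returns)

print("mean return:", round(mean_return, 2))
for atom, p in zip(support, probs):
    print(f"atom {atom:>4}: {p:.3f}")
```

In this toy task every sampled return is either 0 or about 8.1, so the probability mass sits on two separated atoms while the mean falls between them at a value the agent never actually receives. That gap is the kind of information a mean-only estimate discards.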
Distributional reinforcement learning is applied in areas such as risk-sensitive control, efficient exploration, wave communications, quantile regression and networks, and multi-agent and multi-task learning, to name a few.