As Expected? An Analysis of Distributional Reinforcement Learning

Distributional reinforcement learning, in which an agent predicts distributions of returns instead of their expected values, has seen empirical success in several Atari 2600 games, outperforming both the human baseline and previous state-of-the-art algorithms. It remains unclear precisely what drives this improvement in performance over traditional reinforcement learning approaches. In this paper, we take initial steps towards answering this question by identifying the conditions under which the distributional perspective leads to behaviour that differs from that of the expected case, and conversely the conditions under which the two are equivalent. We supplement our theoretical findings with empirical results in tabular settings.
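To make the contrast concrete, a brief (illustrative) statement of the two objects being learned, following the standard formulation of distributional reinforcement learning; the symbols below are not drawn from the abstract itself. Expected-value methods learn the action-value function $Q^\pi$, whereas distributional methods learn the full return distribution $Z^\pi$, of which $Q^\pi$ is the expectation:
\begin{align*}
  Q^\pi(x, a) &= \mathbb{E}\!\left[R(x, a)\right] + \gamma\, \mathbb{E}_{x' \sim P(\cdot \mid x, a),\, a' \sim \pi(\cdot \mid x')}\!\left[Q^\pi(x', a')\right], \\
  Z^\pi(x, a) &\overset{D}{=} R(x, a) + \gamma\, Z^\pi(X', A'), \qquad X' \sim P(\cdot \mid x, a),\ A' \sim \pi(\cdot \mid X'), \\
  Q^\pi(x, a) &= \mathbb{E}\!\left[Z^\pi(x, a)\right].
\end{align*}
The question studied in this paper is when acting on $Z^\pi$ (the distributional perspective) induces behaviour different from acting on $Q^\pi$ alone (the expected case).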