Introduction to the special issue on empirical evaluations in reinforcement learning

The field of reinforcement learning aims to develop algorithms that use experience to optimize behavior in sequential decision problems such as (partially observable) Markov decision processes ((PO)MDPs). In such problems, an autonomous agent interacts with an external environment by selecting actions and seeks the sequence of actions that maximizes its long-term performance. In reinforcement learning, the environment is typically initially unknown and learning takes place online; that is, the agent's performance is assessed throughout learning instead of only afterwards. Since many challenging and realistic tasks are well described as sequential decision problems, the development of effective reinforcement-learning algorithms plays an important role in artificial intelligence research. Recent years have seen enormous progress, both in the development of new methods and in the theoretical understanding of existing ones. In particular, great strides have been made in approximating value functions, exploring efficiently, learning in the presence of multiple agents, coping with partial observability, inducing models, and reasoning hierarchically.

The focus of this special issue is not the development of new algorithms but the empirical evaluation of existing ones. Like other machine-learning methods, reinforcement-learning approaches are typically evaluated in one or more of the following three ways: (1) subjectively, (2) theoretically, and (3) empirically. Subjective evaluations, in which researchers assess the significance and potential of new ideas, are important because they leverage the powerful intuition of experts to guide the research process. However, they are also limited because they cannot validate ideas that go against such intuition; for example, they cannot expose fallacious assumptions. Theoretical evaluations are also important, as they are