ReinforcementLearning: A Package to Perform Model-Free Reinforcement Learning in R

Reinforcement learning refers to a group of methods from artificial intelligence in which an agent learns through trial and error (Sutton & Barto, 1998). It differs from supervised learning in that it requires no explicit labels; instead, the agent interacts continuously with its environment. That is, the agent starts in a specific state and performs an action, whereupon it transitions to a new state and, depending on the outcome, receives a reward. Different strategies (e.g., Q-learning) have been proposed to maximize the overall reward, resulting in a so-called policy, which defines the best possible action in each state. As a main advantage, reinforcement learning is applicable to situations in which the dynamics of the environment are unknown or too complex to evaluate (e.g., Mnih et al., 2015). However, there is currently no package available for performing reinforcement learning in R. As a remedy, we introduce the ReinforcementLearning R package, which allows an agent to learn an optimal policy from sampled experience consisting of states, actions and rewards (Pröllochs & Feuerriegel, 2017). Main features of ReinforcementLearning include, but are not limited to: