RLLib: C++ Library to Predict, Control, and Represent Learnable Knowledge Using On/Off Policy Reinforcement Learning

RLLib is a lightweight C++ template library that implements incremental, standard, and gradient temporal-difference learning algorithms in reinforcement learning. It is optimized for robotic applications and embedded devices that operate under fast duty cycles (e.g., $\le$ 30 ms). RLLib has been tested and evaluated on RoboCup 3D soccer simulation agents, NAO V4 humanoid robots, and Tiva C series launchpad microcontrollers to predict, control, learn behaviors, and represent learnable knowledge.
