Paper information - RLLib: C++ Library to Predict, Control, and Represent Learnable Knowledge Using On/Off Policy Reinforcement Learning

RLLib is a lightweight C++ template library that implements incremental, standard, and gradient temporal-difference learning algorithms for reinforcement learning. It is optimized for robotic applications and embedded devices that operate under fast duty cycles (e.g., $$\le 30$$ ms). RLLib has been tested and evaluated on RoboCup 3D soccer simulation agents, NAO V4 humanoid robots, and Tiva C series LaunchPad microcontrollers to predict, control, learn behaviors, and represent learnable knowledge.
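To make "incremental temporal-difference learning" concrete, the following is a minimal sketch of linear TD(λ) prediction with accumulating eligibility traces. It is an illustration only and does not reproduce RLLib's actual API: the struct `LinearTDLambda`, its members, and the parameter names are hypothetical. Each update costs time linear in the number of features, which is the property that makes this family of algorithms a plausible fit for short duty cycles.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical illustration of linear TD(lambda); NOT RLLib's API.
struct LinearTDLambda {
    double alpha;           // step size
    double gamma;           // discount factor
    double lambda;          // trace decay rate
    std::vector<double> w;  // weight vector
    std::vector<double> e;  // accumulating eligibility traces

    LinearTDLambda(std::size_t nFeatures, double stepSize, double discount,
                   double traceDecay)
        : alpha(stepSize), gamma(discount), lambda(traceDecay),
          w(nFeatures, 0.0), e(nFeatures, 0.0) {}

    // Linear value estimate: v(x) = w . x
    double predict(const std::vector<double>& x) const {
        assert(x.size() == w.size());
        double v = 0.0;
        for (std::size_t i = 0; i < w.size(); ++i) v += w[i] * x[i];
        return v;
    }

    // One incremental update for the transition (x, r, xNext).
    // Pass an all-zero xNext to represent a terminal state.
    // Returns the TD error delta = r + gamma * v(xNext) - v(x).
    double update(const std::vector<double>& x, double r,
                  const std::vector<double>& xNext) {
        const double delta = r + gamma * predict(xNext) - predict(x);
        for (std::size_t i = 0; i < w.size(); ++i) {
            e[i] = gamma * lambda * e[i] + x[i];  // decay, then accumulate
            w[i] += alpha * delta * e[i];         // O(n) weight update
        }
        return delta;
    }

    // Clear the traces at the start of each episode.
    void resetTraces() { e.assign(e.size(), 0.0); }
};
```

In use, a caller would construct one learner per prediction, call `update` once per control cycle with the current and next feature vectors, and call `resetTraces` at episode boundaries. Off-policy gradient-TD variants add a second weight vector and an importance-sampling ratio, which this sketch omits.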

Ubbo Visser | Saminda Abeyruwan
