RLLib: C++ Library to Predict, Control, and Represent Learnable Knowledge Using On/Off Policy Reinforcement Learning
[1] Richard S. Sutton,et al. On the role of tracking in stationary environments , 2007, ICML '07.
[2] Martha White,et al. Linear Off-Policy Actor-Critic , 2012, ICML.
[3] Christopher M. Bishop,et al. Pattern Recognition and Machine Learning (Information Science and Statistics) , 2006 .
[4] Shalabh Bhatnagar,et al. Toward Off-Policy Learning Control with Function Approximation , 2010, ICML.
[5] Patrick M. Pilarski,et al. Tuning-free step-size adaptation , 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[6] Peter Stone,et al. Autonomous transfer for reinforcement learning , 2008, AAMAS.
[7] Csaba Szepesvári,et al. Algorithms for Reinforcement Learning , 2010, Synthesis Lectures on Artificial Intelligence and Machine Learning.
[8] Hiroaki Kitano,et al. RoboCup: A Challenge Problem for AI and Robotics , 1997, RoboCup.
[9] Richard S. Sutton,et al. Generalization in Reinforcement Learning: Successful Examples Using Sparse Coarse Coding , 1996 .
[10] Hervé Frezza-Buet,et al. A C++ template-based reinforcement learning library: fitting the code to the mathematics , 2013, J. Mach. Learn. Res..
[11] R. Sutton,et al. Gradient temporal-difference learning algorithms , 2011 .
[12] Martha White,et al. Linear Off-Policy Actor-Critic , 2012, ICML.
[13] George Konidaris,et al. Value Function Approximation in Reinforcement Learning Using the Fourier Basis , 2011, AAAI.
[14] Tim Kovacs,et al. On the analysis and design of software for reinforcement learning, with a survey of existing systems , 2011, Machine Learning.
[15] Olivier Buffet,et al. Markov Decision Processes in Artificial Intelligence , 2010 .
[16] Andrew G. Barto,et al. Adaptive Step-Size for Online Temporal Difference Learning , 2012, AAAI.
[17] Ubbo Visser,et al. Dynamic role assignment using general value functions , 2013, ALA 2013.
[18] Richard S. Sutton,et al. True Online TD(lambda) , 2014, ICML.
[19] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[20] Patrick M. Pilarski,et al. Horde: a scalable real-time architecture for learning knowledge from unsupervised sensorimotor interaction , 2011, AAMAS.
[21] Richard S. Sutton,et al. Introduction to Reinforcement Learning , 1998 .
[22] Richard S. Sutton,et al. GQ(lambda): A general gradient algorithm for temporal-difference prediction learning with eligibility traces , 2010, Artificial General Intelligence.
[23] Patrick M. Pilarski,et al. Model-Free reinforcement learning with continuous action in practice , 2012, 2012 American Control Conference (ACC).
[24] Nasser M. Nasrabadi,et al. Pattern Recognition and Machine Learning , 2006, Technometrics.
[25] Andre Cohen,et al. An object-oriented representation for efficient reinforcement learning , 2008, ICML '08.
[26] Olivier Buffet,et al. Markov Decision Processes in Artificial Intelligence , 2013 .
[27] Andreas Seekircher,et al. Accurate Ball Tracking with Extended Kalman Filters as a Prerequisite for a High-level Behavior with Reinforcement Learning , 2011 .
[28] Richard S. Sutton,et al. True online TD(λ) , 2014, ICML 2014.
[29] Pawel Wawrzynski,et al. dotRL: A platform for rapid Reinforcement Learning methods development and validation , 2013, 2013 Federated Conference on Computer Science and Information Systems.
[30] Jan Peters,et al. Reinforcement learning in robotics: A survey , 2013, Int. J. Robotics Res..
[31] Brian Tanner,et al. RL-Glue: Language-Independent Software for Reinforcement-Learning Experiments , 2009, J. Mach. Learn. Res..