Acquiring a broad range of empirical knowledge in real time by temporal-difference learning
Patrick M. Pilarski | Richard S. Sutton | Adam White | Joseph Modayil
[1] Byron Boots, et al. An Online Spectral Learning Algorithm for Partially Observable Nonlinear Dynamical Systems, 2011, AAAI.
[2] Wolfram Burgard, et al. The dynamic window approach to collision avoidance, 1997, IEEE Robotics Autom. Mag.
[3] Peter Stone, et al. Policy gradient reinforcement learning for fast quadrupedal locomotion, 2004, IEEE International Conference on Robotics and Automation, 2004. Proceedings. ICRA '04. 2004.
[4] Richard S. Sutton, et al. Multi-timescale nexting in a reinforcement learning robot, 2011, Adapt. Behav.
[5] R. Sutton, et al. Gradient temporal-difference learning algorithms, 2011.
[6] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[7] Patrick M. Pilarski, et al. Horde: a scalable real-time architecture for learning knowledge from unsupervised sensorimotor interaction, 2011, AAMAS.
[8] Farbod Fahimi, et al. The Development of a Myoelectric Training Tool for Above-Elbow Amputees, 2012, The open biomedical engineering journal.
[9] Pieter Abbeel, et al. Autonomous Helicopter Aerobatics through Apprenticeship Learning, 2010, Int. J. Robotics Res.
[10] Richard S. Sutton, et al. Learning to predict by the methods of temporal differences, 1988, Machine Learning.
[11] Erik Talvitie, et al. Learning to Make Predictions In Partially Observable Environments Without a Generative Model, 2011, J. Artif. Intell. Res.
[12] Stefan Schaal, et al. Natural Actor-Critic, 2003, Neurocomputing.
[13] Andrew W. Moore, et al. Locally Weighted Learning, 1997, Artificial Intelligence Review.
[14] Doina Precup, et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning, 1999, Artif. Intell.
[15] Peter Stone, et al. Machine Learning for Fast Quadrupedal Locomotion, 2004, AAAI.
[16] Steven M. LaValle, et al. Planning algorithms, 2006.
[17] R. S. Sutton, et al. Dynamic switching and real-time machine learning for improved human control of assistive biomedical robots, 2012, 2012 4th IEEE RAS & EMBS International Conference on Biomedical Robotics and Biomechatronics (BioRob).
[18] Sebastian Thrun, et al. Lifelong robot learning, 1993, Robotics Auton. Syst.