Acquiring Diverse Predictive Knowledge in Real Time by Temporal-difference Learning
[1] Andrew W. Moore,et al. Generalization in Reinforcement Learning: Safely Approximating the Value Function , 1994, NIPS.
[2] Bernd Fritzke,et al. A Growing Neural Gas Network Learns Topologies , 1994, NIPS.
[3] Michael I. Jordan,et al. Reinforcement Learning with Soft State Aggregation , 1994, NIPS.
[4] Ben J. A. Kröse,et al. Learning from delayed rewards , 1995, Robotics Auton. Syst..
[5] Sebastian Thrun,et al. Lifelong robot learning , 1993, Robotics Auton. Syst..
[6] Richard S. Sutton,et al. Generalization in Reinforcement Learning: Successful Examples Using Sparse Coarse Coding , 1996 .
[7] Wolfram Burgard,et al. The dynamic window approach to collision avoidance , 1997, IEEE Robotics Autom. Mag..
[8] Doina Precup,et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning , 1999, Artif. Intell..
[9] Yishay Mansour,et al. Learning Rates for Q-learning , 2004, J. Mach. Learn. Res..
[10] Peter Dayan,et al. Q-learning , 1992, Machine Learning.
[11] Andrew W. Moore,et al. Locally Weighted Learning , 1997, Artificial Intelligence Review.
[12] Peter Stone,et al. Machine Learning for Fast Quadrupedal Locomotion , 2004, AAAI.
[13] Doina Precup,et al. Sparse Distributed Memories for On-Line Value-Based Reinforcement Learning , 2004, ECML.
[14] Henry Y. K. Lau,et al. Adaptive state space partitioning for reinforcement learning , 2004, Eng. Appl. Artif. Intell..
[15] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[16] Richard S. Sutton,et al. Learning to predict by the methods of temporal differences , 1988, Machine Learning.
[17] Stefan Schaal,et al. Natural Actor-Critic , 2003, Neurocomputing.
[18] Steven M. LaValle,et al. Planning algorithms , 2006 .
[19] Keith A. Bush. An echo state model of non-markovian reinforcement learning , 2007 .
[20] André da Motta Salles Barreto,et al. Restricted gradient-descent algorithm for value-function approximation in reinforcement learning , 2008, Artif. Intell..
[21] Fernando Fernández,et al. Two steps reinforcement learning , 2008, Int. J. Intell. Syst..
[22] Stephen Lin,et al. Evolutionary Tile Coding: An Automated State Abstraction Algorithm for Reinforcement Learning , 2010, Abstraction, Reformulation, and Approximation.
[23] Pieter Abbeel,et al. Autonomous Helicopter Aerobatics through Apprenticeship Learning , 2010, Int. J. Robotics Res..
[24] Tom Schaul,et al. Exploring parameter space in reinforcement learning , 2010, Paladyn J. Behav. Robotics.
[25] Csaba Szepesvári,et al. Algorithms for Reinforcement Learning , 2010, Synthesis Lectures on Artificial Intelligence and Machine Learning.
[26] Patrick M. Pilarski,et al. Horde: a scalable real-time architecture for learning knowledge from unsupervised sensorimotor interaction , 2011, AAMAS.
[27] Bart De Schutter,et al. Approximate reinforcement learning: An overview , 2011, 2011 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL).
[28] George Konidaris,et al. Value Function Approximation in Reinforcement Learning Using the Fourier Basis , 2011, AAAI.
[29] Erik Talvitie,et al. Learning to Make Predictions In Partially Observable Environments Without a Generative Model , 2011, J. Artif. Intell. Res..
[30] Hans Kleine Büning,et al. State Aggregation by Growing Neural Gas for Reinforcement Learning in Continuous State Spaces , 2011, 2011 10th International Conference on Machine Learning and Applications and Workshops.
[31] Byron Boots,et al. An Online Spectral Learning Algorithm for Partially Observable Nonlinear Dynamical Systems , 2011, AAAI.
[32] R. Sutton,et al. Gradient temporal-difference learning algorithms , 2011 .
[33] R. S. Sutton,et al. Dynamic switching and real-time machine learning for improved human control of assistive biomedical robots , 2012, 2012 4th IEEE RAS & EMBS International Conference on Biomedical Robotics and Biomechatronics (BioRob).
[34] Richard S. Sutton,et al. Multi-timescale Nexting in a Reinforcement Learning Robot , 2012, SAB.
[35] Farbod Fahimi,et al. The Development of a Myoelectric Training Tool for Above-Elbow Amputees , 2012, The open biomedical engineering journal.