A robot that reinforcement-learns to identify and memorize important previous observations
Jürgen Schmidhuber | Gabriel Gruener | Bram Bakker | Viktor Zhumatiy
[1] Barruquer Moner. IX. References, 1971.
[2] Richard S. Sutton, et al. Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming, 1990, ML.
[3] Tom M. Mitchell, et al. Reinforcement learning with hidden states, 1993.
[4] Michael L. Littman, et al. An optimization-based categorization of reinforcement learning environments, 1993.
[5] Andrew W. Moore, et al. Reinforcement Learning: A Survey, 1996, J. Artif. Intell. Res.
[6] Mance E. Harmon, et al. Multi-Agent Residual Advantage Learning with General Function Approximation, 1996.
[7] Mark Harmon. Multi-player residual advantage learning with general function, 1996.
[8] Maja J. Matarić, et al. Learning to Use Selective Attention and Short-Term Memory in Sequential Tasks, 1996.
[9] Jürgen Schmidhuber, et al. Long Short-Term Memory, 1997, Neural Computation.
[10] Bram Bakker, et al. Reinforcement Learning with Long Short-Term Memory, 2001, NIPS.
[11] Henrik Jacobsson, et al. Mobile Robot Learning of Delayed Response Tasks through Event Extraction: A Solution to the Road Sign Problem and Beyond, 2001, IJCAI.
[12] Jürgen Schmidhuber, et al. Reinforcement learning in partially observable mobile robot domains using unsupervised event extraction, 2002, IEEE/RSJ International Conference on Intelligent Robots and Systems.
[13] Chris A. Czarnecki, et al. Embedding Connectionist Autonomous Agents in Time: The ‘Road Sign Problem’, 2000, Neural Processing Letters.
[14] Longxin Lin. Self-Improving Reactive Agents Based on Reinforcement Learning, Planning and Teaching, 2004, Machine Learning.