Emergence of Prediction by Reinforcement Learning Using a Recurrent Neural Network
[1] Jun Tani, et al. Learning to generate articulated behavior through the bottom-up and the top-down interaction processes, 2003, Neural Networks.
[2] Katsunari Shibata, et al. Contextual Behaviors and Internal Representations Acquired by Reinforcement Learning with a Recurrent Neural Network in a Continuous State and Action Space Task, 2008, ICONIP.
[3] Mark B. Ring. Child: A First Step Towards Continual Learning, 1998, Learning to Learn.
[4] Michael I. Jordan, et al. Advances in Neural Information Processing Systems 30, 1995.
[5] Stewart W. Wilson, et al. A Possibility for Implementing Curiosity and Boredom in Model-Building Neural Controllers, 1991.
[6] Jonathan Baxter, et al. Scaling Internal-State Policy-Gradient Methods for POMDPs, 2002.
[7] Jürgen Schmidhuber, et al. Exploring the predictable, 2003.
[8] Pierre-Yves Oudeyer, et al. Intrinsic Motivation Systems for Autonomous Mental Development, 2007, IEEE Transactions on Evolutionary Computation.
[9] Jürgen Schmidhuber, et al. A robot that reinforcement-learns to identify and memorize important previous observations, 2003, Proceedings 2003 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2003) (Cat. No.03CH37453).
[10] Geoffrey E. Hinton, et al. Learning internal representations by error propagation, 1986.
[11] Hajime Kita, et al. Q-Learning with Recurrent Neural Networks as a Controller for the Inverted Pendulum Problem, 1998, ICONIP.
[12] Richard S. Sutton, et al. Predictive Representations of State, 2001, NIPS.
[13] Michael H. Bowling, et al. Online Discovery and Learning of Predictive State Representations, 2005, NIPS.
[14] Tom M. Mitchell, et al. Reinforcement learning with hidden states, 1993.
[15] Jürgen Schmidhuber, et al. Reinforcement Learning in Markovian and Non-Markovian Environments, 1990, NIPS.
[16] Peter Dayan, et al. Q-learning, 1992, Machine Learning.
[17] G. Hartmann, et al. Parallel Processing in Neural Systems and Computers, 1990.
[18] Peter Dayan, et al. Technical Note: Q-Learning, 2004, Machine Learning.
[19] R. A. Brooks, et al. Intelligence without Representation, 1991, Artif. Intell.
[20] Douglas Aberdeen, et al. Scalable Internal-State Policy-Gradient Methods for POMDPs, 2002, ICML.
[21] Richard S. Sutton, et al. Temporal-Difference Networks, 2004, NIPS.
[22] Katsunari Shibata, et al. Acquisition of Flexible Image Recognition by Coupling of Reinforcement Learning and a Neural Network, 2009.