Efficient experience reuse in non-Markovian environments
[1] Jürgen Schmidhuber, et al. Long Short-Term Memory, 1997, Neural Computation.
[2] Jürgen Schmidhuber, et al. Learning to forget: continual prediction with LSTM, 1999.
[3] Jürgen Schmidhuber, et al. Learning Precise Timing with LSTM Recurrent Networks, 2003, J. Mach. Learn. Res.
[4] Jürgen Schmidhuber, et al. Reinforcement Learning in Markovian and Non-Markovian Environments, 1990, NIPS.
[5] Risto Miikkulainen, et al. Efficient Non-linear Control Through Neuroevolution, 2006, ECML.
[6] Michael L. Littman, et al. Memoryless policies: theoretical limitations and practical results, 1994.
[7] Mohamed S. Kamel, et al. Reinforcement learning using a recurrent neural network, 1994, Proceedings of 1994 IEEE International Conference on Neural Networks (ICNN'94).
[8] Long-Ji Lin. Self-Improving Reactive Agents Based on Reinforcement Learning, Planning and Teaching, 2004, Machine Learning.
[9] Takashi Komeda, et al. Reinforcement Learning for POMDP Using State Classification, 2008, MLMTA.
[10] Pierre Geurts, et al. Tree-Based Batch Mode Reinforcement Learning, 2005, J. Mach. Learn. Res.
[11] Kenji Doya, et al. Reinforcement Learning in Continuous Time and Space, 2000, Neural Computation.
[12] Peter Stone, et al. Batch reinforcement learning in a complex domain, 2007, AAMAS '07.
[13] Chris Watkins, et al. Learning from delayed rewards, 1989.
[14] Jürgen Schmidhuber, et al. Quasi-online reinforcement learning for robots, 2006, Proceedings 2006 IEEE International Conference on Robotics and Automation (ICRA 2006).
[15] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[16] Douglas C. Hittle, et al. Robust Reinforcement Learning Control Using Integral Quadratic Constraints for Recurrent Neural Networks, 2007, IEEE Transactions on Neural Networks.
[17] Long-Ji Lin, et al. Reinforcement Learning in Non-Markov Environments, 1992.
[18] Andrew W. Moore, et al. Reinforcement Learning: A Survey, 1996, J. Artif. Intell. Res.
[19] I. Noda, et al. Using suitable action selection rule in reinforcement learning, 2003, SMC'03 Conference Proceedings. 2003 IEEE International Conference on Systems, Man and Cybernetics. Conference Theme - System Security and Assurance (Cat. No.03CH37483).
[20] Jürgen Schmidhuber, et al. Training Recurrent Networks by Evolino, 2007, Neural Computation.
[21] Andrew Y. Ng, et al. Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping, 1999, ICML.
[22] Bram Bakker, et al. Reinforcement Learning with Long Short-Term Memory, 2001, NIPS.
[23] Thomas Martinetz, et al. Improving Optimality of Neural Rewards Regression for Data-Efficient Batch Near-Optimal Policy Identification, 2007, ICANN.
[24] Samuel W. Hasinoff, et al. Reinforcement Learning for Problems with Hidden State, 2003.
[25] Hajime Kita, et al. Recurrent neural networks for reinforcement learning: architecture, learning algorithms and internal representation, 1998, 1998 IEEE International Joint Conference on Neural Networks Proceedings. IEEE World Congress on Computational Intelligence (Cat. No.98CH36227).
[26] Junku Yuh, et al. Application of SONQL for real-time learning of robot behaviors, 2007, Robotics Auton. Syst.
[27] Jürgen Schmidhuber, et al. Solving Deep Memory POMDPs with Recurrent Policy Gradients, 2007, ICANN.
[28] Jürgen Schmidhuber, et al. A robot that reinforcement-learns to identify and memorize important previous observations, 2003, Proceedings 2003 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2003) (Cat. No.03CH37453).
[29] Long-Ji Lin, et al. Reinforcement learning for robots using neural networks, 1992.