Light-weight Reinforcement Learning with Function Approximation for Real-life Control Tasks
[1] Leslie Pack Kaelbling, et al. Practical Reinforcement Learning in Continuous Spaces, 2000, ICML.
[2] Ashwin Ram, et al. Experiments with Reinforcement Learning in Problems with Continuous State and Action Spaces, 1997, Adapt. Behav.
[3] Kary Främling, et al. Guiding exploration by pre-existing knowledge without modifying reward, 2007, Neural Networks.
[4] Richard S. Sutton, et al. Reinforcement Learning, 1992, Handbook of Machine Learning.
[5] Kary Främling. Replacing eligibility trace for action-value learning with function approximation, 2007, ESANN.
[6] Kary Främling. Scaled Gradient Descent Learning Rate - Reinforcement Learning with Light-Seeking Robot, 2004, ICINCO.
[7] Andrew W. Moore, et al. Policy Search using Paired Comparisons, 2003, J. Mach. Learn. Res.
[8] Andrew G. Barto, et al. Reinforcement learning, 1998.
[9] Steffen Udluft, et al. The Recurrent Control Neural Network, 2007, ESANN.
[10] Kenji Doya, et al. Reinforcement Learning in Continuous Time and Space, 2000, Neural Computation.
[11] H. Wechsler, et al. Competitive reinforcement learning in continuous control tasks, 2003, Proceedings of the International Joint Conference on Neural Networks.
[12] Richard S. Sutton, et al. Neuronlike adaptive elements that can solve difficult learning control problems, 1983, IEEE Transactions on Systems, Man, and Cybernetics.
[13] Gerald Tesauro, et al. Temporal difference learning and TD-Gammon, 1995, CACM.
[14] Michail G. Lagoudakis, et al. Least-Squares Policy Iteration, 2003, J. Mach. Learn. Res.
[15] Kary Främling. Adaptive robot learning in a non-stationary environment, 2005, ESANN.
[16] Thomas Martinetz, et al. Neural Rewards Regression for near-optimal policy identification in Markovian and partial observable environments, 2007, ESANN.
[17] Pieter Abbeel, et al. An Application of Reinforcement Learning to Aerobatic Helicopter Flight, 2006, NIPS.
[18] Sridhar Mahadevan, et al. Proto-value Functions: A Laplacian Framework for Learning Representation and Control in Markov Decision Processes, 2007, J. Mach. Learn. Res.
[19] Richard S. Sutton, et al. Reinforcement Learning with Replacing Eligibility Traces, 2005, Machine Learning.
[20] Shigenobu Kobayashi, et al. An Analysis of Actor/Critic Algorithms Using Eligibility Traces: Reinforcement Learning with Imperfect Value Function, 1998, ICML.
[21] Long Ji Lin, et al. Reinforcement Learning of Non-Markov Decision Processes, 1995, Artif. Intell.
[22] James S. Albus, et al. Data Storage in the Cerebellar Model Articulation Controller (CMAC), 1975.