Event-learning and robust policy heuristics
[1] Tommi S. Jaakkola,et al. Convergence Results for Single-Step On-Policy Reinforcement-Learning Algorithms , 2000, Machine Learning.
[2] Csaba Szepesvári,et al. An integrated architecture for motion‐control and path‐planning , 1998 .
[3] Michael L. Littman,et al. Algorithms for Sequential Decision Making , 1996 .
[4] Michael I. Jordan,et al. Learning Without State-Estimation in Partially Observable Markovian Decision Processes , 1994, ICML.
[5] Alberto Isidori,et al. Nonlinear control systems: an introduction (2nd ed.) , 1989 .
[6] András Lőrincz,et al. Approximate geometry representations and sensory fusion , 1996, Neurocomputing.
[7] John N. Tsitsiklis,et al. Neuro-Dynamic Programming , 1996, Encyclopedia of Machine Learning.
[8] G. Lei. A neuron model with fluid properties for solving labyrinthian puzzle , 1990, Biological Cybernetics.
[9] Ian H. Witten,et al. An Adaptive Optimal Controller for Discrete-Time Markov Environments , 1977, Inf. Control..
[10] Narendra Ahuja,et al. Gross motion planning—a survey , 1992, CSUR.
[11] John N. Tsitsiklis,et al. Feature-based methods for large scale dynamic programming , 2004, Machine Learning.
[12] Kenji Doya,et al. Temporal Difference Learning in Continuous Time and Space , 1995, NIPS.
[13] Kenji Doya,et al. Reinforcement Learning in Continuous Time and Space , 2000, Neural Computation.
[14] Csaba Szepesvári,et al. Ockham's razor modeling of the matrisome channels of the basal ganglia thalamocortical loops. , 2001 .
[15] S.H.G. ten Hagen. Continuous State Space Q-Learning for control of Nonlinear Systems , 2001 .
[16] Leslie Pack Kaelbling,et al. Learning Policies for Partially Observable Environments: Scaling Up , 1997, ICML.
[17] Andrew G. Barto,et al. Discrete and Continuous Models , 1978 .
[18] Richard S. Sutton,et al. Generalization in Reinforcement Learning: Successful Examples Using Sparse Coarse Coding , 1996 .
[19] Roderic A. Grupen,et al. The applications of harmonic functions to robotics , 1993, J. Field Robotics.
[20] András Lőrincz,et al. Neurocontroller using dynamic state feedback for compensatory control , 1997, Neural Networks.
[21] Csaba Szepesvári,et al. A Unified Analysis of Value-Function-Based Reinforcement-Learning Algorithms , 1999, Neural Computation.
[22] András Lőrincz,et al. Self-Organizing Multi-Resolution Grid for Motion Planning and Control , 1996, Int. J. Neural Syst..
[23] A. Isidori. Nonlinear Control Systems: An Introduction , 1986 .
[24] Michael I. Jordan,et al. Massachusetts Institute of Technology, Artificial Intelligence Laboratory and Center for Biological and Computational Learning, Department of Brain and Cognitive Sciences , 1996 .
[25] Stan C. A. M. Gielen,et al. Neural Network Dynamics for Path Planning and Obstacle Avoidance , 1995, Neural Networks.
[26] Chris Watkins,et al. Learning from delayed rewards , 1989 .
[27] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[28] S.H.G. ten Hagen,et al. Linear Quadratic Regulation using reinforcement learning , 1998 .
[29] Piero Mussio,et al. Toward a Practice of Autonomous Systems , 1994 .