The parti-game algorithm for variable resolution reinforcement learning in multidimensional state-spaces
[1] G. D. Liveing, et al. The University of Cambridge, 1897, British medical journal.
[2] Arthur L. Samuel, et al. Some Studies in Machine Learning Using the Game of Checkers, 1967, IBM J. Res. Dev.
[3] A. L. Samuel, et al. Some Studies in Machine Learning Using the Game of Checkers, 1967, IBM J. Res. Dev.
[4] Jon Louis Bentley, et al. An Algorithm for Finding Best Matches in Logarithmic Expected Time, 1976, TOMS.
[5] G. Siouris, et al. Optimum systems control, 1979, Proceedings of the IEEE.
[6] Hendrik Van Brussel, et al. A self-learning automaton with variable resolution for high precision assembly by industrial robots, 1982.
[7] Richard S. Sutton, et al. Temporal credit assignment in reinforcement learning, 1984.
[8] Rodney A. Brooks, et al. A subdivision algorithm in configuration space for findpath with rotation, 1983, IEEE Transactions on Systems, Man, and Cybernetics.
[9] Larry S. Davis, et al. Multiresolution path planning for mobile robots, 1986, IEEE J. Robotics Autom.
[10] R. Hoppe. Multi-grid methods for Hamilton-Jacobi-Bellman equations, 1986.
[11] Jean-Philippe Chancelier, et al. Dynamic programming complexity and application, 1988, Proceedings of the 27th IEEE Conference on Decision and Control.
[12] Chee-Seng Chow, et al. Multigrid algorithms and complexity results for discrete-time stochastic control and related fixed-point problems, 1989.
[13] V. Rich. Personal communication, 1989, Nature.
[14] Stephen F. McCormick, et al. Multilevel adaptive methods for partial differential equations, 1989, Frontiers in applied mathematics.
[15] John N. Tsitsiklis, et al. Parallel and distributed computation, 1989.
[16] Richard S. Sutton, et al. Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming, 1990, ML.
[17] Related Fields, et al. Numerical grid generation in computational fluid dynamics and related fields: proceedings of the Third International Conference on Numerical Grid Generation in Computational Fluid Dynamics and Related Fields, Barcelona, Spain, 3-7 June 1991, 1991.
[18] Geoffrey E. Hinton, et al. Feudal Reinforcement Learning, 1992, NIPS.
[19] C. Atkeson, et al. Prioritized Sweeping: Reinforcement Learning with Less Data and Less Time, 1993, Machine Learning.
[20] Leslie Pack Kaelbling, et al. Learning in embedded systems, 1993.
[21] Jing Peng, et al. Efficient Learning and Planning Within the Dyna Framework, 1993, Adapt. Behav.
[22] J. Peng, et al. Efficient Learning and Planning Within the Dyna Framework, 1993, IEEE International Conference on Neural Networks.
[23] Leslie Pack Kaelbling, et al. Hierarchical Learning in Stochastic Domains: Preliminary Results, 1993, ICML.
[24] Ben J. A. Kröse, et al. Learning from delayed rewards, 1995, Robotics Auton. Syst.
[25] Timothy J. Purcell. Sorting and searching, 2005, SIGGRAPH Courses.
[26] G. Swaminathan. Robot Motion Planning, 2006.