The Parti-game Algorithm for Variable Resolution Reinforcement Learning in Multidimensional State-spaces
[1] R. Bellman. Dynamic programming , 1957, Science.
[2] Nils J. Nilsson,et al. Problem-solving methods in artificial intelligence , 1971, McGraw-Hill computer science series.
[3] Donald E. Knuth,et al. Sorting and Searching , 1973 .
[4] Jon Louis Bentley,et al. An Algorithm for Finding Best Matches in Logarithmic Expected Time , 1977, TOMS.
[5] G. Siouris,et al. Optimum systems control , 1979, Proceedings of the IEEE.
[6] Hendrik Van Brussel,et al. A self-learning automaton with variable resolution for high precision assembly by industrial robots , 1982 .
[7] Richard S. Sutton,et al. Neuronlike adaptive elements that can solve difficult learning control problems , 1983, IEEE Transactions on Systems, Man, and Cybernetics.
[8] Richard S. Sutton,et al. Temporal credit assignment in reinforcement learning , 1984 .
[9] Rodney A. Brooks,et al. A subdivision algorithm in configuration space for findpath with rotation , 1983, IEEE Transactions on Systems, Man, and Cybernetics.
[10] Larry S. Davis,et al. Multiresolution path planning for mobile robots , 1986, IEEE J. Robotics Autom..
[11] R. Hoppe. Multi-grid methods for Hamilton-Jacobi-Bellman equations , 1986 .
[12] A. Ramsay. Formal Methods in Artificial Intelligence , 1988 .
[13] Jean-Philippe Chancelier,et al. Dynamic programming complexity and application , 1988, Proceedings of the 27th IEEE Conference on Decision and Control.
[14] Joseph E. Flaherty,et al. Adaptive Methods for Partial Differential Equations , 1989 .
[15] C. Watkins. Learning from delayed rewards , 1989 .
[16] Chee-Seng Chow. Multigrid algorithms and complexity results for discrete-time stochastic control and related fixed-point problems , 1989 .
[17] Stephen F. McCormick,et al. Multilevel adaptive methods for partial differential equations , 1989, Frontiers in applied mathematics.
[18] John N. Tsitsiklis,et al. Parallel and distributed computation , 1989 .
[19] Richard S. Sutton,et al. Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming , 1990, ML.
[20] A. Moore. Variable Resolution Dynamic Programming , 1991, ML.
[21] Jean-Claude Latombe,et al. Robot motion planning , 1991, The Kluwer international series in engineering and computer science.
[22] Geoffrey E. Hinton,et al. Feudal Reinforcement Learning , 1992, NIPS.
[23] C. Atkeson,et al. Prioritized Sweeping: Reinforcement Learning with Less Data and Less Time , 1993, Machine Learning.
[24] Leslie Pack Kaelbling,et al. Learning in embedded systems , 1993 .
[25] Stefan Schaal,et al. Assessing the Quality of Learned Local Models , 1993, NIPS.
[26] Jing Peng,et al. Efficient Learning and Planning Within the Dyna Framework , 1993, Adapt. Behav..
[27] Leslie Pack Kaelbling,et al. Hierarchical Learning in Stochastic Domains: Preliminary Results , 1993, ICML.
[28] Ben J. A. Kröse,et al. Learning from delayed rewards , 1995, Robotics Auton. Syst..
[29] Arthur L. Samuel,et al. Some studies in machine learning using the game of checkers , 2000, IBM J. Res. Dev..