A Sparse Sampling Algorithm for Near-Optimal Planning in Large Markov Decision Processes
[1] Alfred V. Aho, et al. The Design and Analysis of Computer Algorithms, 1974.
[2] M. A. Griffin, et al. Information Processing Systems, 1976.
[3] Richard E. Korf, et al. Real-Time Heuristic Search, 1990, Artif. Intell.
[4] Craig Boutilier, et al. Integrating Planning and Execution in Stochastic Domains, 1994, UAI.
[5] Peter Norvig, et al. A modern, agent-oriented approach to introductory artificial intelligence, 1995, SIGART Bull.
[6] Craig Boutilier, et al. Exploiting Structure in Policy Construction, 1995, IJCAI.
[7] Andrew G. Barto, et al. Learning to Act Using Real-Time Dynamic Programming, 1995, Artif. Intell.
[8] Blai Bonet, et al. A Robust and Fast Action Selection Mechanism for Planning, 1997, AAAI/IAAI.
[9] Reid G. Simmons, et al. Solving Robot Navigation Problems with Initial Pose Uncertainty Using Real-Time Heuristic Search, 1998, AIPS.
[10] Andrew G. Barto, et al. Reinforcement learning, 1998.
[11] Michael Kearns, et al. Finite-Sample Convergence Rates for Q-Learning and Indirect Algorithms, 1998, NIPS.
[12] Kee-Eung Kim, et al. Solving Very Large Weakly Coupled Markov Decision Processes, 1998, AAAI/IAAI.
[13] Xavier Boyen, et al. Tractable Inference for Complex Stochastic Processes, 1998, UAI.
[14] Richard S. Sutton, et al. Dimensions of Reinforcement Learning, 1998.
[15] Daphne Koller, et al. Computing Factored Value Functions for Policies in Structured MDPs, 1999, IJCAI.
[16] Yishay Mansour, et al. Approximate Planning in Large POMDPs via Reusable Trajectories, 1999, NIPS.
[17] David A. McAllester, et al. Approximate Planning for Factored POMDPs using Belief State Simplification, 1999, UAI.
[18] Satinder Singh, et al. An upper bound on the loss from approximate optimal-value functions, 1994, Machine Learning.