Approximate Planning in Large POMDPs via Reusable Trajectories
Michael Kearns | Yishay Mansour | Andrew Y. Ng