POMDP solution methods
[1] Ronald A. Howard. Dynamic Programming and Markov Processes, 1960.
[2] Edward J. Sondik. The Optimal Control of Partially Observable Markov Decision Processes, 1971.
[3] Edward J. Sondik. The Optimal Control of Partially Observable Markov Processes over a Finite Horizon, 1973, Oper. Res.
[4] Edward J. Sondik. The Optimal Control of Partially Observable Markov Processes over the Infinite Horizon: Discounted Costs, 1978, Oper. Res.
[5] C. White, et al. Application of Jensen's inequality to adaptive suboptimal design, 1980.
[6] John N. Tsitsiklis, et al. The Complexity of Markov Decision Processes, 1987, Math. Oper. Res.
[7] Leslie Pack Kaelbling, et al. Acting Optimally in Partially Observable Stochastic Domains, 1994, AAAI.
[8] Michael L. Littman, et al. Memoryless policies: theoretical limitations and practical results, 1994.
[9] Martin L. Puterman. Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.
[10] Michael I. Jordan, et al. Reinforcement Learning Algorithm for Partially Observable Markov Decision Problems, 1994, NIPS.
[11] Andrew McCallum, et al. Instance-Based Utile Distinctions for Reinforcement Learning with Hidden State, 1995, ICML.
[12] Jürgen Schmidhuber, et al. HQ-Learning, 1997, Adapt. Behav.
[13] Eric A. Hansen, et al. An Improved Policy Iteration Algorithm for Partially Observable MDPs, 1997, NIPS.
[14] Leslie Pack Kaelbling, et al. Planning and Acting in Partially Observable Stochastic Domains, 1998, Artif. Intell.
[15] Eric A. Hansen, et al. Solving POMDPs by Searching in Policy Space, 1998, UAI.
[16] Andrew W. Moore, et al. Gradient Descent for General Reinforcement Learning, 1998, NIPS.
[17] Shlomo Zilberstein, et al. Finite-memory control of partially observable systems, 1998.
[18] Anne Condon, et al. On the Undecidability of Probabilistic Planning and Infinite-Horizon Partially Observable Markov Decision Problems, 1999, AAAI/IAAI.
[19] Kee-Eung Kim, et al. Solving POMDPs by Searching the Space of Finite Policies, 1999, UAI.
[20] Leslie Pack Kaelbling, et al. Learning Policies with External Memory, 1999, ICML.
[21] Kee-Eung Kim, et al. Learning Finite-State Controllers for Partially Observable Environments, 1999, UAI.
[22] N. Zhang, et al. Algorithms for partially observable Markov decision processes, 2001.
[23] Douglas Aberdeen, et al. Scalable Internal-State Policy-Gradient Methods for POMDPs, 2002, ICML.
[24] Sean R. Eddy. What is dynamic programming?, 2004, Nature Biotechnology.