Stochastic Local Search for POMDP Controllers
[1] Ronald A. Howard, et al. Dynamic Programming and Markov Processes, 1960.
[2] Shlomo Zilberstein, et al. Finite-memory control of partially observable systems, 1998.
[3] Andrew W. Moore, et al. Gradient Descent for General Reinforcement Learning, 1998, NIPS.
[4] Sebastian Thrun, et al. Monte Carlo POMDPs, 1999, NIPS.
[5] Leslie Pack Kaelbling, et al. Learning Policies with External Memory, 1999, ICML.
[6] Holger H. Hoos, et al. Stochastic Local Search Methods, 1998.
[7] Chelsea C. White, et al. A survey of solution techniques for the partially observed Markov decision process, 1991, Ann. Oper. Res.
[8] Leslie Pack Kaelbling, et al. Planning and Acting in Partially Observable Stochastic Domains, 1998, Artif. Intell.
[9] Douglas Aberdeen, et al. Scalable Internal-State Policy-Gradient Methods for POMDPs, 2002, ICML.
[10] Fred W. Glover, et al. Tabu Search - Part I, 1989, INFORMS J. Comput.
[11] Sean R. Eddy, et al. What is dynamic programming?, 2004, Nature Biotechnology.
[12] Holger H. Hoos, et al. Stochastic local search - methods, models, applications, 1998, DISKI.
[13] Michael L. Littman, et al. Incremental Pruning: A Simple, Fast, Exact Method for Partially Observable Markov Decision Processes, 1997, UAI.
[14] John N. Tsitsiklis, et al. The Complexity of Markov Decision Processes, 1987, Math. Oper. Res.
[15] Joelle Pineau, et al. Point-based value iteration: An anytime algorithm for POMDPs, 2003, IJCAI.
[16] Martin L. Puterman, et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.
[17] Nikos A. Vlassis, et al. A point-based POMDP algorithm for robot planning, 2004, IEEE International Conference on Robotics and Automation (ICRA '04).
[18] Craig Boutilier, et al. Bounded Finite State Controllers, 2003, NIPS.
[19] Wenju Liu, et al. Planning in Stochastic Domains: Problem Characteristics and Approximation, 1996.
[20] Fred Glover, et al. Tabu Search - Part II, 1989, INFORMS J. Comput.
[21] Eric A. Hansen, et al. Solving POMDPs by Searching in Policy Space, 1998, UAI.
[22] C. White, et al. Application of Jensen's inequality to adaptive suboptimal design, 1980.
[23] Andrew McCallum, et al. Instance-Based Utile Distinctions for Reinforcement Learning with Hidden State, 1995, ICML.
[24] Craig Boutilier, et al. A POMDP formulation of preference elicitation problems, 2002, AAAI/IAAI.
[25] Michael I. Jordan, et al. Reinforcement Learning Algorithm for Partially Observable Markov Decision Problems, 1994, NIPS.
[26] Kee-Eung Kim, et al. Learning Finite-State Controllers for Partially Observable Environments, 1999, UAI.
[27] A. Cassandra, et al. Exact and approximate algorithms for partially observable Markov decision processes, 1998.
[28] Anne Condon, et al. On the Undecidability of Probabilistic Planning and Infinite-Horizon Partially Observable Markov Decision Problems, 1999, AAAI/IAAI.
[29] E. J. Sondik, et al. The Optimal Control of Partially Observable Markov Decision Processes, 1971.
[30] Kee-Eung Kim, et al. Solving POMDPs by Searching the Space of Finite Policies, 1999, UAI.
[31] Leslie Pack Kaelbling, et al. Acting Optimally in Partially Observable Stochastic Domains, 1994, AAAI.
[32] Edward J. Sondik, et al. The Optimal Control of Partially Observable Markov Processes over a Finite Horizon, 1973, Oper. Res.
[33] Edward J. Sondik, et al. The Optimal Control of Partially Observable Markov Processes over the Infinite Horizon: Discounted Costs, 1978, Oper. Res.
[34] Michael L. Littman, et al. Memoryless policies: theoretical limitations and practical results, 1994.
[35] Leslie Pack Kaelbling, et al. Learning Policies for Partially Observable Environments: Scaling Up, 1997, ICML.
[36] N. Zhang, et al. Algorithms for partially observable Markov decision processes, 2001.
[37] Hector Geffner, et al. Solving Large POMDPs using Real Time Dynamic Programming, 1998.
[38] Jürgen Schmidhuber, et al. HQ-Learning, 1997, Adapt. Behav.
[39] Eric A. Hansen, et al. An Improved Policy Iteration Algorithm for Partially Observable MDPs, 1997, NIPS.