论文信息 - Bounded Finite State Controllers

Bounded Finite State Controllers

We describe a new approximation algorithm for solving partially observable MDPs. Our bounded policy iteration approach searches through the space of bounded-size, stochastic finite state controllers, combining several advantages of gradient ascent (efficiency, search through restricted controller space) and policy iteration (less vulnerability to local optima).

Craig Boutilier | Pascal Poupart | Craig Boutilier | P. Poupart

[1] Kee-Eung Kim,et al. Solving POMDPs by Searching the Space of Finite Policies , 1999, UAI.

[2] Michael L. Littman,et al. Incremental Pruning: A Simple, Fast, Exact Method for Partially Observable Markov Decision Processes , 1997, UAI.

[3] Joelle Pineau,et al. Point-based value iteration: An anytime algorithm for POMDPs , 2003, IJCAI.

[4] Zhengzhu Feng,et al. Approximate Planning for Factored POMDPs , 2001 .

[5] Craig Boutilier,et al. Computing Optimal Policies for Partially Observable Decision Processes Using Compact Representations , 1996, AAAI/IAAI, Vol. 2.

[6] Leslie Pack Kaelbling,et al. Planning and Acting in Partially Observable Stochastic Domains , 1998, Artif. Intell..

[7] Craig Boutilier,et al. Value-Directed Compression of POMDPs , 2002, NIPS.

[8] Eric A. Hansen,et al. Solving POMDPs by Searching in Policy Space , 1998, UAI.

[9] Milos Hauskrecht,et al. Value-Function Approximations for Partially Observable Markov Decision Processes , 2000, J. Artif. Intell. Res..

[10] Douglas Aberdeen,et al. Scalable Internal-State Policy-Gradient Methods for POMDPs , 2002, ICML.

[11] Jonathan Baxter,et al. Scaling Internal-State Policy-Gradient Methods for POMDPs , 2002 .

[12] Hsien-Te Cheng,et al. Algorithms for partially observable markov decision processes , 1989 .

[13] Weihong Zhang,et al. Speeding Up the Convergence of Value Iteration in Partially Observable Markov Decision Processes , 2011, J. Artif. Intell. Res..

[14] Craig Boutilier,et al. Stochastic Local Search for POMDP Controllers , 2004, AAAI.

[15] Kee-Eung Kim,et al. Learning Finite-State Controllers for Partially Observable Environments , 1999, UAI.