Paper Information - An Improved Grid-Based Approximation Algorithm for POMDPs

An Improved Grid-Based Approximation Algorithm for POMDPs

Although a partially observable Markov decision process (POMDP) provides an appealing model for problems of planning under uncertainty, exact algorithms for POMDPs are intractable. This motivates work on approximation algorithms, and grid-based approximation is a widely used approach. We describe a novel approach to grid-based approximation that uses a variable-resolution regular grid, and show that it outperforms previous grid-based approaches to approximation.

Eric A. Hansen | Rong Zhou

[1] Michael L. Littman, et al. Incremental Pruning: A Simple, Fast, Exact Method for Partially Observable Markov Decision Processes, 1997, UAI.

[2] Ronen I. Brafman, et al. A Heuristic Variable Grid Solution Method for POMDPs, 1997, AAAI/IAAI.

[3] J. Satia, et al. Markovian Decision Processes with Probabilistic Observation of States, 1973.

[4] G. G. Stokes. "J.", 1890, The New Yale Book of Quotations.

[5] Andrew W. Moore, et al. Variable Resolution Discretization for High-Accuracy Solutions of Optimal Control Problems, 1999, IJCAI.

[6] William S. Lovejoy, et al. Computationally Feasible Bounds for Partially Observed Markov Decision Processes, 1991, Oper. Res.

[7] Milos Hauskrecht, et al. Incremental Methods for Computing Bounds in Partially Observable Markov Decision Processes, 1997, AAAI/IAAI.

[8] Eric A. Hansen, et al. Solving POMDPs by Searching in Policy Space, 1998, UAI.

[9] Milos Hauskrecht, et al. Value-Function Approximations for Partially Observable Markov Decision Processes, 2000, J. Artif. Intell. Res.

[10] Richard Washington, et al. BI-POMDP: Bounded, Incremental, Partially-Observable Markov-Model Planning, 1997, ECP.

[11] James S. Kakalik, et al. Optimum Policies for Partially Observable Markov Systems, 1965.

[12] Alvin W. Drake, et al. Observation of a Markov process through a noisy channel, 1962.

[13] Craig Boutilier, et al. Decision-Theoretic Planning: Structural Assumptions and Computational Leverage, 1999, J. Artif. Intell. Res.

[14] Leslie Pack Kaelbling, et al. Learning Policies for Partially Observable Environments: Scaling Up, 1997, ICML.

[15] Weihong Zhang, et al. A Method for Speeding Up Value Iteration in Partially Observable Markov Decision Processes, 1999, UAI.