Incremental Pruning: A Simple, Fast, Exact Method for Partially Observable Markov Decision Processes

Most exact algorithms for general partially observable Markov decision processes (POMDPs) use a form of dynamic programming in which a piecewise-linear and convex representation of one value function is transformed into another. We examine variations of the "incremental pruning" method for solving this problem and compare them to earlier algorithms from theoretical and empirical perspectives. We find that incremental pruning is presently the most efficient exact method for solving POMDPS.

[1]  Edward J. Sondik,et al.  The optimal control of par-tially observable Markov processes , 1971 .

[2]  Edward J. Sondik,et al.  The Optimal Control of Partially Observable Markov Processes over a Finite Horizon , 1973, Oper. Res..

[3]  Edward J. Sondik,et al.  The Optimal Control of Partially Observable Markov Processes over the Infinite Horizon: Discounted Costs , 1978, Oper. Res..

[4]  K. Sawaki,et al.  OPTIMAL CONTROL FOR PARTIALLY OBSERVABLE MARKOV DECISION PROCESSES OVER AN INFINITE HORIZON , 1978 .

[5]  Chelsea C. White,et al.  Solution Procedures for Partially Observed Markov Decision Processes , 1989, Oper. Res..

[6]  Chelsea C. White,et al.  A survey of solution techniques for the partially observed Markov decision process , 1991, Ann. Oper. Res..

[7]  Lonnie Chrisman,et al.  Reinforcement Learning with Perceptual Aliasing: The Perceptual Distinctions Approach , 1992, AAAI.

[8]  Andrew McCallum,et al.  Overcoming Incomplete Perception with Utile Distinction Memory , 1993, ICML.

[9]  Leslie Pack Kaelbling,et al.  Acting Optimally in Partially Observable Stochastic Domains , 1994, AAAI.

[10]  Eric A. Hansen,et al.  Cost-Effective Sensing during Plan Execution , 1994, AAAI.

[11]  Nicholas Kushmerick,et al.  An Algorithm for Probabilistic Planning , 1995, Artif. Intell..

[12]  Peter Norvig,et al.  Artificial Intelligence: A Modern Approach , 1995 .

[13]  Stuart J. Russell,et al.  Approximating Optimal Policies for Partially Observable Stochastic Domains , 1995, IJCAI.

[14]  M. Littman,et al.  Efficient dynamic-programming updates in partially observable Markov decision processes , 1995 .

[15]  Wenju Liu,et al.  Planning in Stochastic Domains: Problem Characteristics and Approximation , 1996 .

[16]  Craig Boutilier,et al.  Computing Optimal Policies for Partially Observable Decision Processes Using Compact Representations , 1996, AAAI/IAAI, Vol. 2.

[17]  Richard Washington,et al.  Incremental Markov-model planning , 1996, Proceedings Eighth IEEE International Conference on Tools with Artificial Intelligence.