High-Level Planning and Control with Incomplete Information Using POMDPs

We develop an approach to planning with incomplete information that is based on three elements: (1) a high-level language, which we call POMDP theories, for describing the effects of actions on both the world and the agent's beliefs; (2) a semantics that translates such theories into actual POMDPs; and (3) a real-time dynamic programming algorithm that produces controllers from such POMDPs. We show that the resulting approach is not only clean and general but practical as well. We have implemented a shell that accepts POMDP theories and produces controllers, and have tested it on a number of problems. In this paper we present the main elements of the approach and report results for the "omelette problem", where the resulting controller performs better than the handcrafted controller.
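To make the third element concrete, the following is a minimal Python sketch of real-time dynamic programming over belief states. It is an illustration under stated assumptions, not the shell described in the abstract: the toy domain (an item hidden on the left or right), the names `transition`, `observation`, `cost`, `successors`, `key`, and `rtdp_trial`, the sensor accuracy of 0.8, the zero initial value estimate, and the rounding of beliefs to two digits for table lookup are all hypothetical choices made for this example.

```python
import random

# Hypothetical toy POMDP (not from the paper): an item is hidden on the
# left (state 0) or right (state 1); state 2 is the absorbing goal.
STATES = [0, 1, 2]
GOAL = 2
ACTIONS = ["sense", "pick_left", "pick_right"]

def transition(a, s):
    """Return {next_state: probability} for action a taken in state s."""
    if (a == "pick_left" and s == 0) or (a == "pick_right" and s == 1):
        return {GOAL: 1.0}
    return {s: 1.0}

def observation(a, s_next):
    """Return {observation: probability} after action a lands in s_next."""
    if s_next == GOAL:
        return {"done": 1.0}
    if a == "sense":  # noisy sensor, assumed 80% accurate
        return {"L": 0.8, "R": 0.2} if s_next == 0 else {"L": 0.2, "R": 0.8}
    return {"none": 1.0}

def cost(a, s):
    """Immediate cost: 0 at the goal, 1 for sensing or a correct pick, 5 otherwise."""
    if s == GOAL:
        return 0.0
    if a == "sense":
        return 1.0
    correct = (a == "pick_left" and s == 0) or (a == "pick_right" and s == 1)
    return 1.0 if correct else 5.0

def successors(b, a):
    """Return {obs: (P(obs | b, a), posterior belief)} via Bayes' rule."""
    joint = {}
    for s, p in enumerate(b):
        for s2, q in transition(a, s).items():
            for o, r in observation(a, s2).items():
                joint.setdefault(o, [0.0] * len(STATES))[s2] += p * q * r
    out = {}
    for o, weights in joint.items():
        total = sum(weights)
        if total > 0:
            out[o] = (total, tuple(w / total for w in weights))
    return out

def key(b, digits=2):
    """Discretize a belief so it can index the value table (an assumption here)."""
    return tuple(round(x, digits) for x in b)

def rtdp_trial(b, V, rng, max_steps=50):
    """Run one greedy trial from belief b, updating the value table V in place."""
    for _ in range(max_steps):
        if b[GOAL] > 0.999:
            return True
        best_q, best_succ = float("inf"), None
        for a in ACTIONS:
            succ = successors(b, a)
            # Expected immediate cost plus expected value of the next belief.
            q = sum(p * cost(a, s) for s, p in enumerate(b))
            q += sum(p * V.get(key(b2), 0.0) for p, b2 in succ.values())
            if q < best_q:
                best_q, best_succ = q, succ
        V[key(b)] = best_q  # Bellman update at the visited belief only
        obs = list(best_succ)
        o = rng.choices(obs, weights=[best_succ[x][0] for x in obs])[0]
        b = best_succ[o][1]
    return False
```

A typical use is to call `rtdp_trial` repeatedly from the same initial belief, for instance `(0.5, 0.5, 0.0)`, sharing one table `V` across trials; the controller is then the greedy action with respect to `V` at each belief reached. Unseen beliefs default to a value of zero, which never overestimates the remaining cost in this toy domain because all costs are non-negative.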
