Planning and Control in Artificial Intelligence: A Unifying Perspective

Selecting actions in environments that are dynamic and not fully predictable or observable is a central problem of intelligent behavior. In AI, this translates into the problem of designing controllers that map sequences of observations into actions so that certain goals are achieved. Three main approaches have been used in AI for designing such controllers: the programming approach, where the controller is programmed by hand in a suitable high-level procedural language; the planning approach, where the controller is derived automatically from a description of the actions and goals; and the learning approach, where the controller is derived from a collection of experiences. All three approaches have had successes and face limitations. The focus of this paper is the planning approach. More specifically, we present an approach to planning based on a family of state models that accommodate different types of action dynamics (deterministic and probabilistic) and sensor feedback (null, partial, and complete). The approach combines high-level representation languages for describing actions, sensors, and goals; mathematical models of sequential decision making that make precise the various planning tasks and their solutions; and heuristic search algorithms for computing those solutions. The approach is supported by a computational tool we have developed that accepts high-level descriptions of actions, sensors, and goals and produces suitable controllers. We also present empirical results and discuss open challenges.
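To make the "planning as heuristic search over a state model" idea concrete, here is a minimal sketch of the simplest case mentioned above: deterministic actions with complete information, solved by A* search over an explicit state-model interface. The StateModel class, the astar_plan function, and the toy LineWorld problem are illustrative assumptions for this sketch, not the paper's tool or its input language.

```python
# Minimal sketch: classical planning cast as heuristic search over a
# deterministic, fully observable state model. Names and the example
# domain are assumptions made for illustration only.
import heapq
from itertools import count

class StateModel:
    """Deterministic state model; subclass to describe a planning problem."""
    def initial_state(self): raise NotImplementedError
    def is_goal(self, state): raise NotImplementedError
    def applicable_actions(self, state): raise NotImplementedError
    def successor(self, state, action): raise NotImplementedError
    def cost(self, state, action): return 1
    def heuristic(self, state): return 0   # trivial (admissible) default

def astar_plan(model):
    """A* search: returns a list of actions reaching a goal, or None."""
    start = model.initial_state()
    tie = count()                           # tie-breaker for equal f-values
    frontier = [(model.heuristic(start), next(tie), start, 0, [])]
    best_g = {start: 0}
    while frontier:
        _, _, state, g, plan = heapq.heappop(frontier)
        if model.is_goal(state):
            return plan
        if g > best_g.get(state, float("inf")):
            continue                        # stale queue entry, skip it
        for action in model.applicable_actions(state):
            child = model.successor(state, action)
            g2 = g + model.cost(state, action)
            if g2 < best_g.get(child, float("inf")):
                best_g[child] = g2
                f = g2 + model.heuristic(child)
                heapq.heappush(frontier, (f, next(tie), child, g2, plan + [action]))
    return None

# Toy example: move a token from cell 0 to cell 4 on a five-cell line.
class LineWorld(StateModel):
    def initial_state(self): return 0
    def is_goal(self, state): return state == 4
    def applicable_actions(self, state):
        return [a for a in ("left", "right")
                if 0 <= state + (1 if a == "right" else -1) <= 4]
    def successor(self, state, action):
        return state + (1 if action == "right" else -1)
    def heuristic(self, state): return abs(4 - state)

if __name__ == "__main__":
    print(astar_plan(LineWorld()))          # ['right', 'right', 'right', 'right']
```

The probabilistic and partially observable cases discussed in the paper replace this deterministic model with MDP and POMDP state models and the search algorithm with real-time dynamic programming over states or belief states; the interface-plus-search structure stays the same.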
