A Framework for Planning in Continuous-time Stochastic Domains

We propose a framework for policy generation in continuous-time stochastic domains with concurrent actions and events of uncertain duration. We make no assumptions about the complexity of the domain dynamics, and our planning algorithm can generate policies for any discrete event system that can be simulated. We use Continuous Stochastic Logic (CSL) as a formalism for expressing temporally extended probabilistic goals, and we have developed a probabilistic anytime algorithm for verifying plans in our framework. We present an efficient procedure for comparing two plans that can be used in a hill-climbing search for a goal-satisfying plan. Our planning framework falls within the generate-test-debug paradigm, and we propose a transformational approach to plan generation that relies on effective analysis and debugging of unsatisfactory plans. Discrete event systems are naturally modeled as generalized semi-Markov processes (GSMPs); we adopt the GSMP as the basis for our planning framework and present preliminary work on a domain-independent approach to plan debugging that uses information from the verification phase.
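For illustration, the sampling-based verification step the abstract describes can be sketched with Wald's sequential probability ratio test over simulated trajectories: repeatedly simulate the system under the plan, record whether each trajectory satisfies the goal, and stop as soon as the evidence decides whether the satisfaction probability meets a CSL threshold. This is a minimal sketch, not the paper's implementation; the function name `sprt_verify`, the parameters `delta`, `alpha`, `beta`, and the toy exponential-deadline model are all illustrative assumptions.

```python
import math
import random

def sprt_verify(sample, theta, delta=0.05, alpha=0.05, beta=0.05):
    """Decide whether P(goal) >= theta via Wald's sequential probability
    ratio test, with indifference region (theta - delta, theta + delta)
    and error probabilities bounded by alpha and beta.

    `sample()` must simulate one trajectory of the discrete event system
    and return True iff the trajectory satisfies the goal formula.
    Assumes 0 < theta - delta and theta + delta < 1."""
    p0, p1 = theta + delta, theta - delta
    accept = math.log(beta / (1.0 - alpha))   # cross below: conclude p >= theta
    reject = math.log((1.0 - beta) / alpha)   # cross above: conclude p < theta
    llr = 0.0                                 # running log-likelihood ratio
    while accept < llr < reject:
        if sample():
            llr += math.log(p1 / p0)
        else:
            llr += math.log((1.0 - p1) / (1.0 - p0))
    return llr <= accept

# Toy model: a task with exponentially distributed duration (rate 2.0)
# must finish within a deadline of 1.0 time units. The true satisfaction
# probability is 1 - e^{-2} (about 0.86), well above theta = 0.5.
random.seed(0)
print(sprt_verify(lambda: random.expovariate(2.0) <= 1.0, theta=0.5))
```

Because the test stops sampling as soon as either boundary is crossed, it needs only a handful of trajectories when the true probability is far from the threshold, which is what makes a simulation-based anytime verifier practical inside a hill-climbing plan search.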
