Optimal Control of Markov Decision Processes with Temporal Logic Constraints

In this paper, we develop a method to automatically generate a control policy for a dynamical system modeled as a Markov Decision Process (MDP). The control specification is given as a Linear Temporal Logic (LTL) formula over a set of propositions defined on the states of the MDP. We synthesize a control policy that maximizes the probability that the MDP satisfies the given specification. In addition, we designate an “optimizing proposition” to be repeatedly satisfied, and we formulate a novel optimization criterion: minimizing the expected cost incurred between consecutive satisfactions of this proposition. We propose a sufficient condition for a policy to be optimal, and develop a dynamic programming algorithm that synthesizes a policy that is optimal for a set of LTL specifications. This problem is motivated by robotic applications that require persistent tasks, such as environmental monitoring or data gathering.
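
As a rough illustration of the first objective, maximizing the probability of satisfying a specification is commonly reduced to maximizing the probability of reaching a target set of states in an MDP. The sketch below is not the paper's algorithm: it is plain value iteration for maximal reachability probability on a hypothetical toy MDP, and it leaves out both the LTL-to-automaton product construction and the cost-per-cycle criterion. The names `max_reach_probability` and `toy_mdp`, the state and action labels, and the transition probabilities are all assumptions made for this example.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical encoding: transitions[state][action] = [(next_state, probability), ...]
Transitions = Dict[str, Dict[str, List[Tuple[str, float]]]]


def max_reach_probability(
    transitions: Transitions,
    targets: Set[str],
    tol: float = 1e-10,
    max_iters: int = 10_000,
) -> Tuple[Dict[str, float], Dict[str, str]]:
    """Value iteration for the maximal probability of reaching `targets`."""
    # Start from 1 on target states and 0 elsewhere, then iterate upward.
    values = {s: (1.0 if s in targets else 0.0) for s in transitions}
    policy: Dict[str, str] = {}
    for _ in range(max_iters):
        delta = 0.0
        for state, actions in transitions.items():
            if state in targets or not actions:
                continue  # target and absorbing states keep their value
            # Bellman backup: pick the action with the best expected value.
            best_action, best_value = max(
                (
                    (action, sum(p * values[nxt] for nxt, p in outcomes))
                    for action, outcomes in actions.items()
                ),
                key=lambda pair: pair[1],
            )
            delta = max(delta, abs(best_value - values[state]))
            values[state] = best_value
            policy[state] = best_action
        if delta < tol:
            break
    return values, policy


# Toy MDP (assumed for illustration): "risky" reaches the goal quickly but may
# fall into a trap; "safe" may have to retry but never fails.
toy_mdp: Transitions = {
    "s0": {
        "safe": [("goal", 0.5), ("s0", 0.5)],
        "risky": [("goal", 0.9), ("trap", 0.1)],
    },
    "goal": {},
    "trap": {},
}

values, policy = max_reach_probability(toy_mdp, {"goal"})
print(policy["s0"], round(values["s0"], 6))
```

In this toy model the retrying action wins because it reaches the goal with probability approaching one, whereas the faster action caps the success probability at 0.9. An expected-cost criterion such as the one described above would then be needed to choose among policies that achieve the same satisfaction probability.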
