Multi-Objective Policy Generation for Mobile Robots under Probabilistic Time-Bounded Guarantees

We present a methodology for the generation of mobile robot controllers which offer probabilistic time-bounded guarantees on successful task completion, whilst also trying to satisfy soft goals. The approach is based on a stochastic model of the robot’s environment and action execution times, a set of soft goals, and a formal task specification in co-safe linear temporal logic, which are analysed using multi-objective model checking techniques for Markov decision processes. For efficiency, we propose a novel two-step approach. First, we explore policies on the Pareto front for minimising expected task execution time whilst optimising the achievement of soft goals. Then, we use this to prune a model with more detailed timing information, yielding a time-dependent policy for which more fine-grained probabilistic guarantees can be provided. We illustrate and evaluate the generation of policies on a delivery task in a care home scenario, where the robot also tries to engage in entertainment activities with the
