I-dual: Solving Constrained SSPs via Heuristic Search in the Dual Space

We consider the problem of generating optimal stochastic policies for Constrained Stochastic Shortest Path problems (constrained SSPs), a natural model for planning under uncertainty for resource-bounded agents with multiple competing objectives. While unconstrained SSPs enjoy a multitude of efficient heuristic search methods that focus on promising areas reachable from the initial state, the state of the art for constrained SSPs revolves around linear and dynamic programming algorithms that explore the entire state space. In this paper, we present i-dual, the first heuristic search algorithm for constrained SSPs. To concisely represent constraints and efficiently decide their violation, i-dual operates in the space of dual variables describing the policy occupation measures. It does so while retaining the ability to use standard value function heuristics computed by well-known methods. Our experiments show that these features enable i-dual to achieve up to two orders of magnitude improvement in run-time and memory over linear programming algorithms.
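
To make the notion of "dual variables describing the policy occupation measures" concrete, the following is a sketch of the standard occupation-measure linear program for a constrained SSP. The notation is an assumption introduced here for illustration and does not come from the abstract: S is the state set, G the goal states, s_0 the initial state, A(s) the actions applicable in s, P the transition function, C_0 the primary cost, C_1..C_k the secondary costs with upper bounds \hat{c}_1..\hat{c}_k, and x_{s,a} the expected number of times action a is executed in state s. It is presumably this kind of full-state-space program that the linear programming baselines solve; the abstract does not spell out how i-dual itself proceeds.

```latex
% Assumed notation (not from the abstract): x_{s,a} is the occupation measure,
% [s = s_0] is 1 if s is the initial state and 0 otherwise.
\begin{align*}
\min_{x \ge 0} \quad
  & \sum_{s \in S \setminus G} \sum_{a \in A(s)} x_{s,a}\, C_0(s,a) \\
\text{s.t.} \quad
  & \mathrm{out}(s) - \mathrm{in}(s) = [s = s_0]
      && \forall s \in S \setminus G \\
  & \sum_{g \in G} \mathrm{in}(g) = 1 \\
  & \sum_{s \in S \setminus G} \sum_{a \in A(s)} x_{s,a}\, C_j(s,a) \le \hat{c}_j
      && j = 1, \dots, k
\end{align*}
% Flow into and out of a state:
\[
\mathrm{in}(s) = \sum_{s' \in S \setminus G} \sum_{a \in A(s')} x_{s',a}\, P(s \mid s', a),
\qquad
\mathrm{out}(s) = \sum_{a \in A(s)} x_{s,a}
\]
% A stochastic policy is read off a feasible solution wherever out(s) > 0:
\[
\pi(a \mid s) = \frac{x_{s,a}}{\mathrm{out}(s)}
\]
```

In this form each secondary-cost bound is a single linear inequality over x, which is one way to read the claim that constraints are represented concisely and their violation is decided efficiently. Because the optimum may assign positive x_{s,a} to several actions in the same state, the resulting policy is in general stochastic.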
