Percentile queries in multi-dimensional Markov decision processes

Markov decision processes (MDPs) with multi-dimensional weights are useful for analyzing systems with multiple, possibly conflicting objectives that require trade-off analysis. We study the complexity of percentile queries in such MDPs and give algorithms to synthesize strategies that enforce such constraints. Given a multi-dimensional weighted MDP, a quantitative payoff function $f$, value thresholds $v_i$ (one per dimension), and probability thresholds $\alpha_i$, we show how to compute a single strategy that enforces, for every dimension $i$, that the probability of outcomes $\rho$ satisfying $f_i(\rho) \ge v_i$ is at least $\alpha_i$. We consider classical quantitative payoffs from the literature (sup, inf, lim sup, lim inf, mean-payoff, truncated sum, discounted sum). Our work extends the multi-objective model checking problem studied by Etessami et al. (Log Methods Comput Sci 4(4), 2008) in unweighted MDPs to the quantitative case.
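The percentile query described above can be written as a single formula. The following is a sketch only: the symbols $M$ (the MDP), $s_{\mathrm{init}}$ (an initial state), $\sigma$ (a strategy), $d$ (the number of dimensions), and $\mathbb{P}^{\sigma}_{M, s_{\mathrm{init}}}$ (the probability measure on outcomes induced by $\sigma$) are notation assumed here for illustration and do not appear in the abstract.

```latex
% Percentile query: does one strategy \sigma satisfy all d constraints at once?
% Assumed notation: M, s_{\mathrm{init}}, \sigma, d, \mathbb{P}^{\sigma}_{M, s_{\mathrm{init}}}.
\[
  \exists\, \sigma \;:\quad
  \bigwedge_{i=1}^{d}
  \mathbb{P}^{\sigma}_{M,\, s_{\mathrm{init}}}
  \bigl[\, \{ \rho \mid f_i(\rho) \ge v_i \} \,\bigr] \;\ge\; \alpha_i
\]
```

The existential quantifier sits outside the conjunction: the same strategy $\sigma$ must meet every threshold pair $(v_i, \alpha_i)$ simultaneously, which is what distinguishes the query from solving $d$ separate single-dimension problems.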
