Paper information - Approximate planning and verification for large Markov decision processes

We study the planning and verification problems for very large or infinite probabilistic systems, such as Markov Decision Processes (MDPs), from a complexity point of view. More precisely, we address the design of an efficient approximation method that computes a near-optimal policy for the MDP planning problem, together with the satisfaction probabilities of properties of interest, such as reachability or safety, over the Markov chain obtained by restricting the MDP to that near-optimal policy. The complexity of the approximation method is independent of the size of the state space, and the method uses only a probabilistic generator of the MDP.
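The abstract does not spell out the algorithm, so the following is only a generic sketch of the sampling idea it alludes to, not the authors' method. It estimates a bounded reachability probability on a Markov chain that is accessible only through a generator (a function returning a random successor state), with the number of sampled paths chosen from Hoeffding's inequality. The names `hoeffding_sample_count`, `estimate_reachability`, and `random_walk_step`, as well as the integer random walk itself, are assumptions made for illustration.

```python
import math
import random


def hoeffding_sample_count(epsilon, delta):
    """Number of samples so that the empirical mean of [0, 1]-valued draws is
    within epsilon of the true mean with probability at least 1 - delta."""
    return math.ceil(math.log(2.0 / delta) / (2.0 * epsilon ** 2))


def estimate_reachability(generator, start, is_target, horizon, epsilon, delta, rng):
    """Estimate the probability of hitting a target state within `horizon` steps.

    `generator(state, rng)` is the only access to the chain: it returns one
    random successor of `state`. The state space is never enumerated.
    """
    n = hoeffding_sample_count(epsilon, delta)
    hits = 0
    for _ in range(n):
        state = start
        for _ in range(horizon):
            if is_target(state):
                break
            state = generator(state, rng)
        if is_target(state):
            hits += 1
    return hits / n


def random_walk_step(state, rng):
    """Hypothetical generator: a walk on all integers, +1 with probability 0.5, else -1."""
    return state + 1 if rng.random() < 0.5 else state - 1


if __name__ == "__main__":
    rng = random.Random(0)
    estimate = estimate_reachability(
        random_walk_step, start=0, is_target=lambda s: s >= 3,
        horizon=10, epsilon=0.05, delta=0.01, rng=rng,
    )
    print("samples:", hoeffding_sample_count(0.05, 0.01))
    print("estimated reachability probability:", estimate)
```

In this sketch the sample count depends only on the accuracy `epsilon` and the confidence `delta`, and each sample costs at most `horizon` generator calls, so the total work does not depend on how many states the chain has (the example walk has infinitely many). Handling the policy choice in an MDP would need an additional mechanism that this sketch does not attempt.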

Richard Lassaigne | Sylvain Peyronnet
