Solving Ergodic Markov Decision Processes and Perfect Information Zero-sum Stochastic Games by Variance Reduced Deflated Value Iteration
Marianne Akian | Zheng Qu | Stéphane Gaubert | Omar Saadi
[1] S. Gaubert, et al. Policy iteration for perfect information stochastic mean payoff games with bounded first return times is strongly polynomial, 2013, 1310.4953.
[2] Yu. S. Ledyaev, et al. Nonsmooth analysis and control theory, 1998.
[3] S. Lippman, et al. Stochastic Games with Perfect Information and Time Average Payoff, 1969.
[4] S. Gaubert, et al. A Collatz-Wielandt characterization of the spectral radius of order-preserving homogeneous maps on cones, 2011, 1112.5968.
[5] Mengdi Wang, et al. Primal-Dual π Learning: Sample Complexity and Sublinear Run Time for Ergodic Markov Decision Problems, 2017, ArXiv.
[6] Peter W. Glynn, et al. An empirical algorithm for relative value iteration for average-cost MDPs, 2015, 2015 54th IEEE Conference on Decision and Control (CDC).
[7] Xian Wu, et al. Variance reduced value iteration and faster algorithms for solving Markov decision processes, 2017, SODA.
[8] Lin F. Yang, et al. Near-Optimal Time and Sample Complexities for Solving Discounted Markov Decision Process with a Generative Model, 2018, 1806.01492.
[9] L. Shapley, et al. Stochastic Games, 1953, Proceedings of the National Academy of Sciences.
[10] John Mallet-Paret, et al. Eigenvalues for a class of homogeneous cone maps arising from max-plus operators, 2002.
[11] Sylvain Sorin, et al. Stochastic Games and Applications, 2003.
[12] John N. Tsitsiklis, et al. An Analysis of Stochastic Shortest Path Problems, 1991, Math. Oper. Res.
[13] Peter Whittle, et al. Optimization Over Time, 1982.
[14] S. Gaubert, et al. Generic uniqueness of the bias vector of finite stochastic games with perfect information, 2016, 1610.09651.
[15] E. Dynkin. Boundary Theory of Markov Processes (The Discrete Case), 1969.
[16] Martin L. Puterman, et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.