Randomized Linear Programming Solves the Discounted Markov Decision Problem In Nearly-Linear (Sometimes Sublinear) Running Time