[1] Wotao Yin,et al. More Iterations per Second, Same Quality - Why Asynchronous Algorithms may Drastically Outperform Traditional Ones , 2017, ArXiv.
[2] Lin F. Yang,et al. Near-Optimal Time and Sample Complexities for Solving Discounted Markov Decision Process with a Generative Model , 2018, 1806.01492.
[3] Mengdi Wang,et al. Sample-Optimal Parametric Q-Learning with Linear Transition Models , 2019, ICML 2019.
[4] Shane Legg,et al. Massively Parallel Methods for Deep Reinforcement Learning , 2015, ArXiv.
[5] Xian Wu,et al. Near-Optimal Time and Sample Complexities for Solving Markov Decision Processes with a Generative Model , 2018, NeurIPS.
[6] Mengdi Wang,et al. Randomized Linear Programming Solves the Discounted Markov Decision Problem In Nearly-Linear Running Time , 2017, ArXiv.
[7] Pieter Abbeel,et al. Asynchronous Methods for Model-Based Reinforcement Learning , 2019, CoRL.
[8] Alex Graves,et al. Playing Atari with Deep Reinforcement Learning , 2013, ArXiv.
[9] Xian Wu,et al. Variance reduced value iteration and faster algorithms for solving Markov decision processes , 2017, SODA.
[10] Wotao Yin,et al. On Unbounded Delays in Asynchronous Parallel Fixed-Point Algorithms , 2016, J. Sci. Comput..
[11] Pieter Abbeel,et al. Accelerated Methods for Deep Reinforcement Learning , 2018, ArXiv.
[12] Lin F. Yang,et al. On the Optimality of Sparse Model-Based Planning for Markov Decision Processes , 2019, ArXiv.
[13] Ming Yan,et al. ARock: an Algorithmic Framework for Asynchronous Parallel Coordinate Updates , 2015, SIAM J. Sci. Comput..
[14] Martin L. Puterman,et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming , 1994 .
[15] Hilbert J. Kappen,et al. On the Sample Complexity of Reinforcement Learning with a Generative Model , 2012, ICML.
[16] Mengdi Wang,et al. Reinforcement Leaning in Feature Space: Matrix Bandit, Kernels, and Regret Bound , 2019, ICML.
[17] M. Kosorok,et al. Adaptive Treatment Strategies in Practice: Planning Trials and Analyzing Data for Personalized Medicine , 2015 .
[18] Daniel Kudenko,et al. Parallel reinforcement learning with linear function approximation , 2007, AAMAS '07.
[19] Lihong Li,et al. Scalable Bilinear π Learning Using State and Action Features , 2018, ICML 2018.
[20] Hamid Reza Feyzmahdavian,et al. On the convergence rates of asynchronous iterations , 2014, 53rd IEEE Conference on Decision and Control.
[21] I. Pinelis. On inequalities for sums of bounded random variables , 2006, math/0603030.
[22] John N. Tsitsiklis,et al. Some aspects of parallel and distributed iterative algorithms - A survey , 1991, Autom..
[23] Jan Peters,et al. Reinforcement learning in robotics: A survey , 2013, Int. J. Robotics Res..
[25] Vivek S. Borkar,et al. Empirical Q-Value Iteration , 2014, Stochastic Systems.
[26] Yishay Mansour,et al. A Sparse Sampling Algorithm for Near-Optimal Planning in Large Markov Decision Processes , 1999, Machine Learning.
[27] Mengdi Wang,et al. Sample-Optimal Parametric Q-Learning Using Linearly Additive Features , 2019, ICML.
[28] Mengdi Wang,et al. Randomized Linear Programming Solves the Discounted Markov Decision Problem In Nearly-Linear (Sometimes Sublinear) Running Time , 2017, 1704.01869.
[29] W. Hoeffding. Probability Inequalities for sums of Bounded Random Variables , 1963 .
[30] John N. Tsitsiklis,et al. Asynchronous Stochastic Approximation and Q-Learning , 1994, Machine Learning.
[31] Stephen Tyree,et al. Reinforcement Learning through Asynchronous Advantage Actor-Critic on a GPU , 2016, ICLR.
[32] Hilbert J. Kappen,et al. Speedy Q-Learning , 2011, NIPS.
[33] Yuxi Li,et al. Deep Reinforcement Learning: An Overview , 2017, ArXiv.
[34] R. Bellman,et al. On the Theory of Dynamic Programming , 1952, Proceedings of the National Academy of Sciences of the United States of America.
[35] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[36] Yishay Mansour,et al. Learning Rates for Q-learning , 2004, J. Mach. Learn. Res..
[37] Michael I. Jordan,et al. Is Q-learning Provably Efficient? , 2018, NeurIPS.
[38] Alex Graves,et al. Asynchronous Methods for Deep Reinforcement Learning , 2016, ICML.
[39] John N. Tsitsiklis,et al. Parallel and distributed computation , 1989 .