Stochastic Variance-Reduced Policy Gradient
Marcello Restelli | Matteo Pirotta | Matteo Papini | Damiano Binaghi | Giuseppe Canonaco
[1] Mark W. Schmidt,et al. Stop Wasting My Gradients: Practical SVRG , 2015, NIPS.
[2] Francis Bach,et al. SAGA: A Fast Incremental Gradient Method With Support for Non-Strongly Convex Composite Objectives , 2014, NIPS.
[3] Julien Mairal,et al. Stochastic Optimization with Variance Reduction for Infinite Datasets with Finite Sum Structure , 2016, NIPS.
[4] Peter L. Bartlett,et al. Infinite-Horizon Policy-Gradient Estimation , 2001, J. Artif. Intell. Res.
[5] Reuven Y. Rubinstein,et al. Simulation and the Monte Carlo method , 1981, Wiley series in probability and mathematical statistics.
[6] Lihong Li,et al. Stochastic Variance Reduction Methods for Policy Evaluation , 2017, ICML.
[7] Jian Peng,et al. Stochastic Variance Reduction for Policy Gradient Estimation , 2017, ArXiv.
[8] David Barber,et al. A Unifying Perspective of Parametric Policy Search Methods for Markov Decision Processes , 2012, NIPS.
[9] Sham M. Kakade,et al. A Natural Policy Gradient , 2001, NIPS.
[10] Zeyuan Allen Zhu,et al. Variance Reduction for Faster Non-Convex Optimization , 2016, ICML.
[11] Francis R. Bach,et al. Stochastic Variance Reduction Methods for Saddle-Point Problems , 2016, NIPS.
[12] John Darzentas,et al. Problem Complexity and Method Efficiency in Optimization , 1983 .
[13] Yishay Mansour,et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation , 1999, NIPS.
[14] Lex Weaver,et al. The Optimal Reward Baseline for Gradient-Based Reinforcement Learning , 2001, UAI.
[15] Yishay Mansour,et al. Learning Bounds for Importance Weighting , 2010, NIPS.
[16] Doina Precup,et al. Eligibility Traces for Off-Policy Policy Evaluation , 2000, ICML.
[17] Yurii Nesterov,et al. Introductory Lectures on Convex Optimization - A Basic Course , 2014, Applied Optimization.
[18] Luca Bascetta,et al. Adaptive Step-Size for Policy Gradient Methods , 2013, NIPS.
[19] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, MIT Press.
[20] Philip S. Thomas,et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation and Action-Dependent Baselines , 2017, ArXiv.
[21] H. Robbins. A Stochastic Approximation Method , 1951 .
[22] Pieter Abbeel,et al. Benchmarking Deep Reinforcement Learning for Continuous Control , 2016, ICML.
[23] Jie Liu,et al. Mini-Batch Semi-Stochastic Gradient Descent in the Proximal Setting , 2015, IEEE Journal of Selected Topics in Signal Processing.
[24] Ronald J. Williams,et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning , 1992, Machine Learning.
[25] Yann LeCun,et al. Large Scale Online Learning , 2003, NIPS.
[26] Mark W. Schmidt,et al. A Stochastic Gradient Method with an Exponential Convergence Rate for Finite Training Sets , 2012, NIPS.
[27] Alexander J. Smola,et al. Stochastic Variance Reduction for Nonconvex Optimization , 2016, ICML.
[28] Marcello Restelli,et al. Adaptive Batch Size for Safe Policy Gradients , 2017, NIPS.
[29] Masashi Sugiyama,et al. Analysis and Improvement of Policy Gradient Estimation , 2011 .
[30] Saeed Ghadimi,et al. Stochastic First- and Zeroth-Order Methods for Nonconvex Stochastic Programming , 2013, SIAM J. Optim..
[31] Luca Bascetta,et al. Policy gradient in Lipschitz Markov Decision Processes , 2015, Machine Learning.
[32] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[33] Gang Niu,et al. Analysis and Improvement of Policy Gradient Estimation , 2011, NIPS.
[34] Stefan Schaal,et al. Reinforcement learning of motor skills with policy gradients , 2008 .
[35] Julien Mairal,et al. Incremental Majorization-Minimization Optimization with Application to Large-Scale Machine Learning , 2014, SIAM J. Optim..
[36] Justin Domke,et al. Finito: A faster, permutable incremental gradient method for big data problems , 2014, ICML.
[37] Tong Zhang,et al. Accelerating Stochastic Gradient Descent using Predictive Variance Reduction , 2013, NIPS.
[38] Alexander J. Smola,et al. Fast Incremental Method for Nonconvex Optimization , 2016, ArXiv.
[39] Yuval Tassa,et al. MuJoCo: A physics engine for model-based control , 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[40] Filip Jurčíček,et al. Reinforcement learning for spoken dialogue systems using off-policy natural gradient method , 2012, 2012 IEEE Spoken Language Technology Workshop (SLT).