David Duvenaud | Will Grathwohl | Geoffrey Roeder | Yuhuai Wu | Dami Choi
[1] Max Welling,et al. Auto-Encoding Variational Bayes , 2013, ICLR.
[2] Joshua B. Tenenbaum,et al. Human-level concept learning through probabilistic program induction , 2015, Science.
[3] Pieter Abbeel,et al. Gradient Estimation Using Stochastic Computation Graphs , 2015, NIPS.
[4] Shigenobu Kobayashi,et al. An Analysis of Actor/Critic Algorithms Using Eligibility Traces: Reinforcement Learning with Imperfect Value Function , 1998, ICML.
[5] Xi Chen,et al. Evolution Strategies as a Scalable Alternative to Reinforcement Learning , 2017, ArXiv.
[6] Sergey Levine,et al. High-Dimensional Continuous Control Using Generalized Advantage Estimation , 2015, ICLR.
[7] Alexander D'Amour,et al. Reducing Reparameterization Gradient Variance , 2017, NIPS.
[8] David M. Blei,et al. Overdispersed Black-Box Variational Inference , 2016, UAI.
[9] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[10] Andriy Mnih,et al. Variational Inference for Monte Carlo Objectives , 2016, ICML.
[11] Jascha Sohl-Dickstein,et al. REBAR: Low-variance, unbiased gradient estimates for discrete latent variable models , 2017, NIPS.
[12] Hao Liu,et al. Sample-efficient Policy Optimization with Stein Control Variate , 2017, ArXiv.
[13] Dengyong Zhou,et al. Action-dependent Control Variates for Policy Optimization via Stein's Identity , 2017.
[14] N. Chopin,et al. Control functionals for Monte Carlo integration , 2014, arXiv:1410.2392.
[15] Ben Poole,et al. Categorical Reparameterization with Gumbel-Softmax , 2016, ICLR.
[16] Elman Mansimov,et al. Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation , 2017, NIPS.
[17] Ronald J. Williams,et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning , 1992, Machine Learning.
[18] B. Speelpenning. Compiling Fast Partial Derivatives of Functions Given by Algorithms , 1980.
[19] Sergey Levine,et al. Reinforcement Learning with Deep Energy-Based Policies , 2017, ICML.
[20] Yishay Mansour,et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation , 1999, NIPS.
[21] David Barber,et al. Variational Optimization , 2012, ArXiv.
[22] Louis B. Rall,et al. Automatic Differentiation: Techniques and Applications , 1981, Lecture Notes in Computer Science.
[23] Tom Schaul,et al. Natural Evolution Strategies , 2008, 2008 IEEE Congress on Evolutionary Computation (IEEE World Congress on Computational Intelligence).
[24] Daan Wierstra,et al. Stochastic Backpropagation and Approximate Inference in Deep Generative Models , 2014, ICML.
[25] David M. Blei,et al. The Generalized Reparameterization Gradient , 2016, NIPS.
[26] Karol Gregor,et al. Neural Variational Inference and Learning in Belief Networks , 2014, ICML.
[27] Yuval Tassa,et al. MuJoCo: A physics engine for model-based control , 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[28] Alexandre M. Bayen,et al. Variance Reduction for Policy Gradient with Action-Dependent Factorized Baselines , 2018, ICLR.
[29] Richard E. Turner,et al. Interpolated Policy Gradient: Merging On-Policy and Off-Policy Gradient Estimation for Deep Reinforcement Learning , 2017, NIPS.
[30] Sepp Hochreiter,et al. Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs) , 2015, ICLR.
[31] H. Robbins. A Stochastic Approximation Method , 1951.
[32] Yoshua Bengio,et al. Generative Adversarial Nets , 2014, NIPS.
[33] Sergey Levine,et al. Q-Prop: Sample-Efficient Policy Gradient with An Off-Policy Critic , 2016, ICLR.
[34] Yee Whye Teh,et al. The Concrete Distribution: A Continuous Relaxation of Discrete Random Variables , 2016, ICLR.
[35] Scott W. Linderman,et al. Reparameterization Gradients through Acceptance-Rejection Sampling Algorithms , 2016, AISTATS.
[36] Sergey Levine,et al. The Mirage of Action-Dependent Baselines in Reinforcement Learning , 2018, ICML.
[37] Wojciech Zaremba,et al. Reinforcement Learning Neural Turing Machines - Revised , 2015.