Credit Assignment Techniques in Stochastic Computation Graphs
Théophane Weber | Nicolas Heess | Lars Buesing | David Silver
[1] Paul J. Werbos, et al. Applications of advances in nonlinear sensitivity analysis, 1982.
[2] Geoffrey E. Hinton, et al. Learning internal representations by error propagation, 1986.
[3] Dan Geiger, et al. Identifying independence in Bayesian networks, 1990, Networks.
[4] Paul Glasserman, et al. Gradient Estimation Via Perturbation Analysis, 1990.
[5] Paul Glasserman, et al. Smoothing complements and randomized score functions, 1992, Ann. Oper. Res..
[6] Andrew Y. Ng, et al. Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping, 1999, ICML.
[7] Yishay Mansour, et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation, 1999, NIPS.
[8] Peter L. Bartlett, et al. Variance Reduction Techniques for Gradient Estimates in Reinforcement Learning, 2001, J. Mach. Learn. Res..
[9] Ronald J. Williams, et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning, 2004, Machine Learning.
[10] H. Robbins. A Stochastic Approximation Method, 1951.
[11] Jürgen Schmidhuber, et al. Policy Gradient Critics, 2007, ECML.
[12] Uwe Naumann, et al. Optimal Jacobian accumulation is NP-complete, 2007, Math. Program..
[13] Nir Friedman, et al. Probabilistic Graphical Models - Principles and Techniques, 2009.
[14] Mark W. Schmidt, et al. Convergence Rates of Inexact Proximal-Gradient Methods for Convex Optimization, 2011, NIPS.
[15] Michael Fairbank, et al. Value-gradient learning, 2012, The 2012 International Joint Conference on Neural Networks (IJCNN).
[16] Michael I. Jordan, et al. Variational Bayesian Inference with Stochastic Search, 2012, ICML.
[17] David Wingate, et al. Automated Variational Inference in Probabilistic Programming, 2013, ArXiv.
[18] Yoshua Bengio, et al. Estimating or Propagating Gradients Through Stochastic Neurons for Conditional Computation, 2013, ArXiv.
[19] Karol Gregor, et al. Neural Variational Inference and Learning in Belief Networks, 2014, ICML.
[20] Daan Wierstra, et al. Stochastic Backpropagation and Approximate Inference in Deep Generative Models, 2014, ICML.
[21] Max Welling, et al. Auto-Encoding Variational Bayes, 2013, ICLR.
[22] Guy Lever, et al. Deterministic Policy Gradient Algorithms, 2014, ICML.
[23] Max Welling, et al. GPS-ABC: Gaussian Process Surrogate Approximate Bayesian Computation, 2014, UAI.
[24] Noah D. Goodman, et al. Amortized Inference in Probabilistic Reasoning, 2014, CogSci.
[25] Max Welling, et al. Efficient Gradient-Based Inference through Transformations between Bayes Nets and Neural Nets, 2014, ICML.
[26] Christian Osendorfer, et al. Learning Stochastic Recurrent Networks, 2014, NIPS.
[27] Zhe Gan, et al. Deep Temporal Sigmoid Belief Networks for Sequence Modeling, 2015, NIPS.
[28] Yoshua Bengio, et al. A Recurrent Latent Variable Model for Sequential Data, 2015, NIPS.
[29] Pieter Abbeel, et al. Gradient Estimation Using Stochastic Computation Graphs, 2015, NIPS.
[30] Yuval Tassa, et al. Learning Continuous Control Policies by Stochastic Value Gradients, 2015, NIPS.
[31] Uri Shalit, et al. Deep Kalman Filters, 2015, ArXiv.
[32] David Silver, et al. Reinforced Variational Inference, 2015, NIPS.
[33] Julien Cornebise, et al. Weight Uncertainty in Neural Networks, 2015, ArXiv.
[34] Yuval Tassa, et al. Continuous control with deep reinforcement learning, 2015, ICLR.
[35] Andriy Mnih, et al. Variational Inference for Monte Carlo Objectives, 2016, ICML.
[36] Geoffrey E. Hinton, et al. Attend, Infer, Repeat: Fast Scene Understanding with Generative Models, 2016, NIPS.
[37] Dilin Wang, et al. Stein Variational Gradient Descent: A General Purpose Bayesian Inference Algorithm, 2016, NIPS.
[38] Il Memming Park, et al. Black Box Variational Inference for State Space Models, 2015, ArXiv:1511.07367.
[39] Sergey Levine, et al. MuProp: Unbiased Backpropagation for Stochastic Neural Networks, 2015, ICLR.
[40] Sergey Levine, et al. High-Dimensional Continuous Control Using Generalized Advantage Estimation, 2015, ICLR.
[41] Yuval Tassa, et al. Learning and Transfer of Modulated Locomotor Controllers, 2016, ArXiv.
[42] Ole Winther, et al. Sequential Neural Models with Stochastic Layers, 2016, NIPS.
[43] Doina Precup, et al. The Option-Critic Architecture, 2016, AAAI.
[44] Alex Graves, et al. Decoupled Neural Interfaces using Synthetic Gradients, 2016, ICML.
[45] Ben Poole, et al. Categorical Reparameterization with Gumbel-Softmax, 2016, ICLR.
[46] Yee Whye Teh, et al. The Concrete Distribution: A Continuous Relaxation of Discrete Random Variables, 2016, ICLR.
[47] Razvan Pascanu, et al. Sobolev Training for Neural Networks, 2017, NIPS.
[48] Yi Wu, et al. Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments, 2017, NIPS.
[49] Hao Liu, et al. Efficient Structured Inference for Stochastic Recurrent Neural Networks, 2017.
[50] Jascha Sohl-Dickstein, et al. REBAR: Low-variance, unbiased gradient estimates for discrete latent variable models, 2017, NIPS.
[51] A. Doucet, et al. Controlled sequential Monte Carlo, 2017, The Annals of Statistics.
[52] Marcin Andrychowicz, et al. Sim-to-Real Transfer of Robotic Control with Dynamics Randomization, 2017, 2018 IEEE International Conference on Robotics and Automation (ICRA).
[53] Marcin Andrychowicz, et al. Parameter Space Noise for Exploration, 2017, ICLR.
[54] Alexandre M. Bayen, et al. Variance Reduction for Policy Gradient with Action-Dependent Factorized Baselines, 2018, ICLR.
[55] Paavo Parmas, et al. Total stochastic gradient algorithms and applications in reinforcement learning, 2019, NeurIPS.
[56] Shimon Whiteson, et al. Counterfactual Multi-Agent Policy Gradients, 2017, AAAI.
[57] Shane Legg, et al. Noisy Networks for Exploration, 2017, ICLR.
[58] Hanning Zhou, et al. Backprop-Q: Generalized Backpropagation for Stochastic Computation Graphs, 2018, ArXiv.
[59] David Duvenaud, et al. Backpropagation through the Void: Optimizing control variates for black-box gradient estimation, 2017, ICLR.
[60] N. Heess, et al. Neural belief states for partially observed domains, 2018.
[61] Sergey Levine, et al. Reinforcement Learning and Control as Probabilistic Inference: Tutorial and Review, 2018, ArXiv.
[62] Sergey Levine, et al. Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor, 2018, ICML.
[63] Yee Whye Teh, et al. Sequential Attend, Infer, Repeat: Generative Modelling of Moving Objects, 2018, NeurIPS.
[64] Sergey Levine, et al. The Mirage of Action-Dependent Baselines in Reinforcement Learning, 2018, ICML.
[65] Shakir Mohamed, et al. Implicit Reparameterization Gradients, 2018, NeurIPS.
[66] Shimon Whiteson, et al. Deep Variational Reinforcement Learning for POMDPs, 2018, ICML.
[67] Christopher C. Drovandi, et al. Variational Bayes with synthetic likelihood, 2016, Statistics and Computing.
[68] Fabio Viola, et al. Learning and Querying Fast Generative Models for Reinforcement Learning, 2018, ArXiv.
[69] Karol Gregor, et al. Temporal Difference Variational Auto-Encoder, 2018, ICLR.
[70] Sepp Hochreiter, et al. RUDDER: Return Decomposition for Delayed Rewards, 2018, NeurIPS.
[71] Yoshua Bengio, et al. Probabilistic Planning with Sequential Monte Carlo methods, 2018, ICLR.