Gradient Estimation Using Stochastic Computation Graphs
Pieter Abbeel | Nicolas Heess | John Schulman | Theophane Weber
[1] Judea Pearl,et al. Probabilistic reasoning in intelligent systems - networks of plausible inference , 1991, Morgan Kaufmann series in representation and reasoning.
[2] Peter W. Glynn,et al. Likelihood ratio gradient estimation for stochastic systems , 1990, CACM.
[3] Jürgen Schmidhuber,et al. Long Short-Term Memory , 1997, Neural Computation.
[4] Yoshua Bengio,et al. Gradient-based learning applied to document recognition , 1998, Proc. IEEE.
[5] Geoffrey E. Hinton,et al. A View of the EM Algorithm that Justifies Incremental, Sparse, and other Variants , 1998, Learning in Graphical Models.
[6] Stephen J. Wright,et al. Numerical Optimization , 2018, Fundamental Statistical Inference.
[7] Yishay Mansour,et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation , 1999, NIPS.
[8] Andreas Griewank,et al. Evaluating derivatives - principles and techniques of algorithmic differentiation, Second Edition , 2000, Frontiers in applied mathematics.
[9] Peter L. Bartlett,et al. Infinite-Horizon Policy-Gradient Estimation , 2001, J. Artif. Intell. Res.
[10] D K Smith,et al. Numerical Optimization , 2001, J. Oper. Res. Soc..
[11] Paul Glasserman,et al. Monte Carlo Methods in Financial Engineering , 2003.
[12] Peter L. Bartlett,et al. Variance Reduction Techniques for Gradient Estimates in Reinforcement Learning , 2001, J. Mach. Learn. Res.
[13] Ronald J. Williams,et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning , 2004, Machine Learning.
[14] Rémi Munos,et al. Policy Gradient in Continuous Time , 2006, J. Mach. Learn. Res.
[15] Marc Toussaint,et al. Learning model-free robot control by a Monte Carlo EM algorithm , 2009, Auton. Robots.
[16] James Martens,et al. Deep learning via Hessian-free optimization , 2010, ICML.
[17] Jürgen Schmidhuber,et al. Recurrent policy gradients , 2010, Log. J. IGPL.
[19] David Wingate,et al. Automated Variational Inference in Probabilistic Programming , 2013, ArXiv.
[20] Yoshua Bengio,et al. Estimating or Propagating Gradients Through Stochastic Neurons for Conditional Computation , 2013, ArXiv.
[21] Karol Gregor,et al. Neural Variational Inference and Learning in Belief Networks , 2014, ICML.
[22] Daan Wierstra,et al. Stochastic Backpropagation and Approximate Inference in Deep Generative Models , 2014, ICML.
[23] Max Welling,et al. Auto-Encoding Variational Bayes , 2013, ICLR.
[24] Guy Lever,et al. Deterministic Policy Gradient Algorithms , 2014, ICML.
[25] Daan Wierstra,et al. Deep AutoRegressive Networks , 2013, ICML.
[26] Sean Gerrish,et al. Black Box Variational Inference , 2013, AISTATS.
[27] Alex Graves,et al. Recurrent Models of Visual Attention , 2014, NIPS.
[28] Max Welling,et al. Efficient Gradient-Based Inference through Transformations between Bayes Nets and Neural Nets , 2014, ICML.
[29] Wojciech Zaremba,et al. Reinforcement Learning Neural Turing Machines - Revised , 2015.
[30] Wojciech Zaremba,et al. Reinforcement Learning Neural Turing Machines , 2015, ArXiv.