MuProp: Unbiased Backpropagation for Stochastic Neural Networks

Deep neural networks are powerful parametric models that can be trained efficiently using the backpropagation algorithm. Stochastic neural networks combine the power of large parametric functions with that of graphical models, which makes it possible to learn very complex distributions. However, as backpropagation is not directly applicable to stochastic networks that include discrete sampling operations within their computational graph, training such networks remains difficult. We present MuProp, an unbiased gradient estimator for stochastic networks, designed to make this task easier. MuProp improves on the likelihood-ratio estimator by reducing its variance using a control variate based on the first-order Taylor expansion of a mean-field network. Crucially, unlike prior attempts at using backpropagation for training stochastic networks, the resulting estimator is unbiased and well behaved. Our experiments on structured output prediction and discrete latent variable modeling demonstrate that MuProp yields consistently good performance across a range of difficult tasks.
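To make the estimator described above concrete, below is a minimal, hypothetical sketch of a MuProp-style gradient estimate for a single Bernoulli stochastic unit, assuming a scalar logit parameter `theta`, a downstream loss `f`, and its derivative `f_prime` (all names are illustrative, not the paper's implementation). It combines the likelihood-ratio term with a control variate given by the first-order Taylor expansion of `f` around the mean-field value, then adds back the analytic gradient of the control variate's expectation so the estimate stays unbiased.

```python
# Sketch of a single-sample MuProp estimate, assuming one Bernoulli unit
# z ~ Bernoulli(sigmoid(theta)) and a differentiable downstream loss f(z).
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def muprop_gradient(theta, f, f_prime, rng):
    """Single-sample estimate of d/dtheta E_{z~Bern(sigmoid(theta))}[f(z)]."""
    mu = sigmoid(theta)                 # mean-field value of the stochastic unit
    z = float(rng.random() < mu)        # discrete sample z in {0, 1}

    # Score function (likelihood-ratio) term: d/dtheta log p(z | theta)
    score = z - mu                      # holds for Bernoulli with logit theta

    # Control variate: first-order Taylor expansion of f around the mean-field value
    taylor = f(mu) + f_prime(mu) * (z - mu)

    # Centered likelihood-ratio term plus the analytic gradient of the
    # control variate's expectation (which keeps the estimator unbiased).
    dmu_dtheta = mu * (1.0 - mu)
    return (f(z) - taylor) * score + f_prime(mu) * dmu_dtheta

# Usage example: estimate the gradient of E[(z - 0.25)^2] w.r.t. the logit.
rng = np.random.default_rng(0)
f = lambda z: (z - 0.25) ** 2
f_prime = lambda z: 2.0 * (z - 0.25)
grads = [muprop_gradient(0.3, f, f_prime, rng) for _ in range(10000)]
print(np.mean(grads))   # close to the true gradient 0.5 * mu * (1 - mu)
```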
