Early Inference in Energy-Based Models Approximates Back-Propagation

We show that Langevin MCMC inference in an energy-based model with latent variables has the property that the early steps of inference, starting from a stationary point, correspond to propagating error gradients into internal layers, much as back-propagation does. The error that is back-propagated is defined with respect to visible units that have received an outside driving force pushing them away from the stationary point. Back-propagated error gradients correspond to temporal derivatives of the activations of hidden units. This observation could be an element of a theory explaining how brains perform credit assignment in deep hierarchies as efficiently as back-propagation does. In this theory, the continuous-valued latent variables correspond to voltage potentials averaged across time, spikes, and possibly neurons in the same minicolumn, and neural computation corresponds to approximate inference and error back-propagation at the same time.
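
As a rough numerical illustration of the central claim, the following sketch (not the authors' code) builds a small layered, symmetrically connected network with a Hopfield-style energy, relaxes it to a stationary point using noise-free gradient dynamics (the deterministic part of a Langevin update), and then applies a weak outside force pushing the output units toward a target. The layer sizes, the tanh nonlinearity, the particular energy function, and the form and strength (`beta`) of the nudge are illustrative assumptions; the only point is that the early movement of the hidden units lines up with the error signal back-propagated through the same symmetric weights.

```python
import numpy as np

rng = np.random.default_rng(0)
nx, nh, ny = 5, 4, 3

# Symmetric ("tied") weights between consecutive layers.
W1 = rng.normal(scale=0.4, size=(nh, nx))   # input  <-> hidden
W2 = rng.normal(scale=0.4, size=(ny, nh))   # hidden <-> output

rho  = np.tanh
drho = lambda s: 1.0 - np.tanh(s) ** 2

x      = rng.normal(size=nx)                # clamped input
target = rng.normal(size=ny)                # desired output activation

def dynamics(h, y):
    """Gradient dynamics ds/dt = -dE/ds for the Hopfield-style energy
       E = 0.5||h||^2 + 0.5||y||^2 - rho(h).(W1 x) - rho(y).(W2 rho(h))."""
    dh = -h + drho(h) * (W1 @ x + W2.T @ rho(y))
    dy = -y + drho(y) * (W2 @ rho(h))
    return dh, dy

# --- Free phase: relax to a stationary point of the energy. ---
h, y = np.zeros(nh), np.zeros(ny)
dt = 0.1
for _ in range(5000):
    dh, dy = dynamics(h, y)
    h += dt * dh
    y += dt * dy

h0, y0 = h.copy(), y.copy()
err = target - rho(y0)          # output error at the stationary point

# --- Weakly nudged phase: an outside driving force pushes the outputs. ---
beta = 0.01
for _ in range(3):              # only the *early* steps of inference
    dh, dy = dynamics(h, y)
    y += dt * (dy + beta * err) # external force on the visible/output units
    h += dt * dh

early_dh = h - h0               # early movement of the hidden-unit states
                                # (scale is irrelevant; we compare directions)

# --- Error back-propagated through the same symmetric weights,
#     with activation derivatives evaluated at the stationary point. ---
delta_h = drho(h0) * (W2.T @ (drho(y0) * err))

cos = early_dh @ delta_h / (np.linalg.norm(early_dh) * np.linalg.norm(delta_h))
print(f"cosine(early hidden movement, back-propagated error) = {cos:.4f}")
```

Under these assumptions the printed cosine is close to 1: during the first few inference steps after the nudge, the hidden units move approximately in the direction prescribed by back-propagation, differing only in overall scale.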
