STDP-Compatible Approximation of Backpropagation in an Energy-Based Model

We show that Langevin Markov chain Monte Carlo inference in an energy-based model with latent variables has the property that the early steps of inference, starting from a stationary point, correspond to propagating error gradients into internal layers, much as backpropagation does. The backpropagated error is with respect to output units that have received an outside driving force pushing them away from the stationary point. Backpropagated error gradients correspond to temporal derivatives of the activations of hidden units. These derivatives lead to a weight update proportional to the product of the presynaptic firing rate and the temporal rate of change of the postsynaptic firing rate. Simulations and a theoretical argument suggest that this rate-based update rule is consistent with the updates associated with spike-timing-dependent plasticity (STDP). The ideas presented in this article could be an element of a theory explaining how brains perform credit assignment in deep hierarchies as efficiently as backpropagation does, with neural computation corresponding simultaneously to approximate inference over continuous-valued latent variables and to error backpropagation.
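
To make the rule concrete, the following is a minimal NumPy sketch of the scheme the abstract describes, not the authors' implementation: a symmetric two-layer network relaxes to a stationary point of Hopfield-style dynamics, the output units are then weakly driven toward a target, and each weight is updated in proportion to the presynaptic rate times the resulting change in the postsynaptic rate, i.e. the time-integrated form of dW_ij/dt proportional to rho(s_i) * d rho(s_j)/dt. The layer sizes, the logistic rate function rho, the step size eps, the nudging strength beta, and the learning rate lr are all illustrative assumptions, and the Langevin noise term is omitted for clarity.

import numpy as np

rng = np.random.default_rng(0)

def rho(s):
    """Firing-rate nonlinearity (logistic; an illustrative choice)."""
    return 1.0 / (1.0 + np.exp(-s))

n_in, n_hid, n_out = 4, 8, 2
W1 = rng.normal(0.0, 0.1, (n_in, n_hid))   # input -> hidden (used symmetrically)
W2 = rng.normal(0.0, 0.1, (n_hid, n_out))  # hidden -> output (used symmetrically)

x = rng.random(n_in)                 # clamped input rates
target = np.array([1.0, 0.0])        # desired output rates

def step(h, y, beta, eps=0.1):
    """One step of leaky relaxation; beta > 0 drives the output toward target."""
    h_new = h + eps * (-h + x @ W1 + rho(y) @ W2.T)               # bottom-up + top-down
    y_new = y + eps * (-y + rho(h) @ W2 + beta * (target - rho(y)))
    return h_new, y_new

# Free phase: settle to a stationary point with the output undriven.
h, y = np.zeros(n_hid), np.zeros(n_out)
for _ in range(200):
    h, y = step(h, y, beta=0.0)
h0, y0 = rho(h), rho(y)              # firing rates at the stationary point

# Nudged phase: only a few early steps, since it is the early steps of
# inference after the output perturbation whose temporal derivatives carry
# backpropagation-like error information.
for _ in range(10):
    h, y = step(h, y, beta=0.5)

# Rate-based, STDP-compatible update: presynaptic rate x change in
# postsynaptic rate, the integral of rho(s_pre) * d/dt rho(s_post)
# accumulated over the nudged phase.
lr = 0.1
W1 += lr * np.outer(x,  rho(h) - h0)
W2 += lr * np.outer(h0, rho(y) - y0)

In this sketch the hidden units never receive an explicit error signal; the change in their rates induced by nudging the output plays the role of the backpropagated gradient, which is the correspondence the abstract asserts.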
