Updates of Equilibrium Prop Match Gradients of Backprop Through Time in an RNN with Static Input

Equilibrium Propagation (EP) is a biologically inspired learning algorithm for convergent recurrent neural networks, i.e., RNNs that are fed a static input x and settle to a steady state. Training convergent RNNs consists of adjusting the weights until the steady state of the output neurons coincides with a target y. Convergent RNNs can also be trained with the more conventional Backpropagation Through Time (BPTT) algorithm. In its original formulation, EP was described in terms of real-time neuronal dynamics, which are computationally costly to simulate. In this work, we introduce a discrete-time version of EP with simplified equations and reduced simulation time, bringing EP closer to practical machine learning tasks. We first prove theoretically, and verify numerically, that the neural and weight updates of EP, computed by forward-time dynamics, are step-by-step equal to the ones obtained by BPTT, whose gradients are computed backward in time. The equality is exact when the transition function of the dynamics derives from a primitive function and the steady state is maintained long enough. We then show that, for more standard discrete-time neural network dynamics, the same property holds approximately, and we subsequently demonstrate training with EP that matches the performance of BPTT. In particular, we define the first convolutional architecture trained with EP, achieving approximately 1% test error on MNIST, the lowest error reported with EP. These results can guide the development of deep neural networks trained with EP.
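
To make the claimed correspondence concrete, the sketch below numerically compares the EP weight-gradient estimate against the gradient obtained by automatic differentiation through the unrolled dynamics (i.e., BPTT). It is a minimal illustration, not the authors' code: it assumes a toy quadratic primitive function Phi with linear, contracting dynamics, the helper names (phi, run, ep_gradient, bptt_gradient) are hypothetical, and it only checks the end-to-end gradient match at the steady state, not the step-by-step equality of transient updates established in the paper. With a small nudging factor beta, the two estimates should agree up to an O(beta) bias.

# Minimal numerical sketch (assumptions as stated above), requires JAX.
import jax
import jax.numpy as jnp

def phi(params, x, s):
    # Toy primitive function: the dynamics is s_{t+1} = dPhi/ds(x, s_t, theta).
    W, A = params
    Wsym = 0.5 * (W + W.T)              # symmetric recurrent weights
    return 0.5 * s @ Wsym @ s + s @ (A @ x)

def loss(s, y):
    return 0.5 * jnp.sum((s - y) ** 2)

def step(params, x, s, y, beta):
    # One step of the (possibly nudged) dynamics: s <- dPhi/ds - beta * dL/ds.
    return jax.grad(phi, argnums=2)(params, x, s) - beta * jax.grad(loss)(s, y)

def run(params, x, s0, y, beta, T):
    s = s0
    for _ in range(T):
        s = step(params, x, s, y, beta)
    return s

def ep_gradient(params, x, s0, y, beta=1e-3, T=100, K=100):
    s_free = run(params, x, s0, y, 0.0, T)         # first (free) phase
    s_nudge = run(params, x, s_free, y, beta, K)   # second (nudged) phase
    dphi = jax.grad(phi, argnums=0)
    g_free = dphi(params, x, s_free)
    g_nudge = dphi(params, x, s_nudge)
    # EP estimate: dJ/dtheta ~= -(1/beta) * (dPhi/dtheta(s_nudge) - dPhi/dtheta(s_free))
    return jax.tree_util.tree_map(lambda a, b: -(a - b) / beta, g_nudge, g_free)

def bptt_gradient(params, x, s0, y, T=100):
    # Loss at the final state, backpropagated through the unrolled free dynamics.
    def unrolled_loss(p):
        return loss(run(p, x, s0, y, 0.0, T), y)
    return jax.grad(unrolled_loss)(params)

key = jax.random.PRNGKey(0)
kW, kA, kx, ky = jax.random.split(key, 4)
n, m = 5, 3
W = 0.1 * jax.random.normal(kW, (n, n))   # small weights so the dynamics converge
A = jax.random.normal(kA, (n, m))
x = jax.random.normal(kx, (m,))
y = jax.random.normal(ky, (n,))
params, s0 = (W, A), jnp.zeros(n)

g_ep = ep_gradient(params, x, s0, y)
g_bptt = bptt_gradient(params, x, s0, y)
for name, a, b in zip(("W", "A"), g_ep, g_bptt):
    print(name, "max abs difference:", float(jnp.max(jnp.abs(a - b))))

On this toy model the printed differences are on the order of beta, consistent with the correspondence becoming exact as beta goes to 0 and the number of relaxation steps grows.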
