On the role of synaptic stochasticity in training low-precision neural networks

Stochasticity and limited precision of synaptic weights in neural network models are key aspects of both biological and hardware modeling of learning processes. Here we show that a neural network model with stochastic binary weights naturally gives prominence to exponentially rare dense regions of solutions with a number of desirable properties such as robustness and good generalization performance, while typical solutions are isolated and hard to find. Binary solutions of the standard perceptron problem are obtained from a simple gradient descent procedure on a set of real values parametrizing a probability distribution over the binary synapses. Both analytical and numerical results are presented. An algorithmic extension that allows training of discrete deep neural networks is also investigated.
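
The gradient-based procedure over a distribution of binary synapses can be illustrated compactly. The following is a minimal sketch, not the paper's exact algorithm: it assumes a factorized distribution over weights w_i in {-1, +1} parametrized by magnetizations m_i = tanh(theta_i), a Gaussian (central-limit) approximation of the pre-activation, and gradient descent on the resulting expected misclassification probability; the function and parameter names are invented for the example.

```python
import numpy as np

def train_stochastic_binary_perceptron(patterns, labels, lr=0.1, epochs=500, seed=0):
    """patterns: (P, N) array of +/-1 inputs; labels: (P,) array of +/-1 targets."""
    rng = np.random.default_rng(seed)
    P, N = patterns.shape
    theta = 0.1 * rng.standard_normal(N)          # real parameters, one per synapse
    for _ in range(epochs):
        m = np.tanh(theta)                        # magnetizations: E[w_i] = m_i
        mean = patterns @ m                       # (P,) mean pre-activation per pattern
        var = np.maximum((1.0 - m ** 2).sum(), 1e-9)  # variance (same for all +/-1 patterns)
        z = labels * mean / np.sqrt(var)          # signed stability in units of the std
        # Expected training error under the Gaussian approximation is sum_mu Phi(-z_mu),
        # whose gradient w.r.t. z_mu is -pdf(z_mu), with pdf the standard normal density.
        pdf = np.exp(-0.5 * z ** 2) / np.sqrt(2.0 * np.pi)
        dz_dm = (labels[:, None] * patterns) / np.sqrt(var) \
                + (labels * mean)[:, None] * m[None, :] / var ** 1.5
        grad_theta = (-pdf[:, None] * dz_dm).sum(axis=0) * (1.0 - m ** 2)  # chain rule through tanh
        theta -= lr * grad_theta
    return np.sign(theta)                         # binarize: pick the most probable weights

if __name__ == "__main__":
    # Toy teacher-student check with random +/-1 patterns.
    rng = np.random.default_rng(1)
    N, P = 101, 40
    teacher = rng.choice([-1.0, 1.0], size=N)
    X = rng.choice([-1.0, 1.0], size=(P, N))
    y = np.sign(X @ teacher)
    w = train_stochastic_binary_perceptron(X, y)
    print("training accuracy:", np.mean(np.sign(X @ w) == y))
```

The key design choice mirrored here is that the optimization variables are continuous (the fields theta), while the final network is obtained by binarizing them, so standard gradient descent can be applied to a discrete-weight problem.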
