Noise Helps Optimization Escape From Saddle Points in the Synaptic Plasticity

Numerous experimental studies suggest that noise is inherent in the human brain, yet its functional importance remains unknown. In particular, from a computational perspective, such stochasticity is potentially harmful to brain function. In machine learning, a large number of saddle points surrounded by high-error plateaus can give the illusion of local minima; being trapped at saddle points can dramatically impair learning, and adding noise is known to alleviate the saddle point problem in high-dimensional optimization, especially under the strict saddle condition. Motivated by these arguments, we propose a biologically plausible noise structure and demonstrate that noise can efficiently improve the optimization performance of spiking neural networks trained by stochastic gradient descent. We derive the strict saddle condition for synaptic plasticity, under which noise helps optimization escape from saddle points in high-dimensional domains. These theoretical results explain the stochasticity of synapses and guide how noise can be exploited. In addition, we provide two biological interpretations of the proposed noise structure: one based on the free-energy principle in neuroscience and the other based on observations from in vivo experiments. Our simulation results show that, in both the learning and test phases, the accuracy of synaptic sampling with noise is almost 20% higher than without noise on a synthetic dataset, and the accuracy gain from noise is at least 10% on the MNIST and CIFAR-10 datasets. Our study provides a new learning framework for the brain and sheds new light on deep noisy spiking neural networks.
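The core mechanism described above, noise-perturbed gradient descent escaping a strict saddle, can be illustrated with a minimal sketch. This is not the paper's synaptic-sampling method; the objective, step size, and noise scale below are illustrative choices only:

```python
import numpy as np

# f(x, y) = x^2 - y^2 has a single stationary point at (0, 0),
# which is a strict saddle: positive curvature along x, negative
# curvature along y. Initialized exactly at the saddle, plain
# gradient descent never moves; isotropic Gaussian noise pushes
# the iterate onto the negative-curvature direction, from which
# gradient descent escapes.

def grad(w):
    x, y = w
    return np.array([2.0 * x, -2.0 * y])

def descend(w0, lr=0.1, steps=50, noise_std=0.0, seed=0):
    rng = np.random.default_rng(seed)
    w = np.array(w0, dtype=float)
    for _ in range(steps):
        w -= lr * (grad(w) + noise_std * rng.standard_normal(2))
    return w

plain = descend([0.0, 0.0])                  # stays pinned at the saddle
noisy = descend([0.0, 0.0], noise_std=0.01)  # drifts away along y
print(plain, noisy)
```

The noiseless run returns exactly `(0, 0)`, while even a tiny noise injection is amplified geometrically along the unstable `y` direction, which is the escape behavior the strict saddle condition guarantees.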
