Neural sampling machine with stochastic synapse allows brain-like learning and inference

Many real-world mission-critical applications require continual online learning from noisy data and real-time decision making with a defined confidence level. Probabilistic models and stochastic neural networks can explicitly handle uncertainty in the data and allow adaptive learning on the fly, but their implementation in a low-power substrate remains a challenge. In this work, we introduce a novel hardware fabric that implements a new class of stochastic neural network, called the Neural Sampling Machine (NSM), which exploits stochasticity in its synaptic connections for approximate Bayesian inference. Harnessing the inherent non-linearities and stochasticity occurring at the atomic level in emerging materials and devices allows us to capture the synaptic stochasticity occurring at the molecular level in biological synapses. We experimentally demonstrate an in silico hybrid stochastic synapse by pairing a ferroelectric field-effect transistor (FeFET)-based analog weight cell with a two-terminal stochastic selector element. Such a stochastic synapse can be integrated within the well-established crossbar array architecture for compute-in-memory (CIM). We experimentally show that the inherent stochastic switching of the selector element between the insulating and metallic states introduces multiplicative stochastic noise within the synapses of the NSM that samples the conductance states of the FeFET during both learning and inference. Using experimentally calibrated models, we perform network-level simulations to highlight the salient automatic weight normalization feature introduced by the stochastic synapses of the NSM, which paves the way for continual online learning without any offline Batch Normalization. We also showcase the Bayesian inference capability introduced by the stochastic synapses during inference mode, thereby accounting for uncertainty in the data. We report a high accuracy of 98.25% on a standard image classification task as well as estimation of data uncertainty on original versus rotated samples. Building such stochastic NSM hardware will allow us to draw inspiration from neuroscience to design machine learning architectures that can both learn and report uncertainty.
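
To make the synapse model concrete, below is a minimal NumPy sketch (not the authors' implementation) of the two ingredients described above: multiplicative Bernoulli "blank-out" noise applied per synapse, standing in for the stochastic selector that gates the FeFET conductance, and repeated stochastic forward passes at inference time to estimate uncertainty. The layer sizes, the blank-out probability p, and the helper names stochastic_synapse_forward and mc_predict are illustrative assumptions, not values or APIs from the paper.

```python
# Minimal sketch of an NSM-style stochastic synapse (illustrative only).
# Assumptions not taken from the paper: a toy fully connected layer, a Bernoulli
# "blank-out" probability p for the stochastic selector, and random input data.
import numpy as np

rng = np.random.default_rng(0)

def stochastic_synapse_forward(x, W, p=0.5):
    """Forward pass through a layer whose weights are gated by stochastic selectors.

    Each synapse is modeled as an analog weight (FeFET conductance, here a float)
    in series with a two-terminal selector that turns ON with probability p,
    i.e. multiplicative Bernoulli noise applied element-wise to the weight matrix W.
    """
    xi = rng.binomial(1, p, size=W.shape)   # one selector sample per synapse
    return x @ (xi * W)                     # noisy pre-activation

def mc_predict(x, W, p=0.5, n_samples=100):
    """Monte Carlo inference: repeat the stochastic forward pass and average.

    The spread across samples serves as a rough proxy for uncertainty,
    in the spirit of sampling-based approximate Bayesian inference.
    """
    logits = np.stack([stochastic_synapse_forward(x, W, p) for _ in range(n_samples)])
    probs = np.exp(logits) / np.exp(logits).sum(axis=-1, keepdims=True)  # softmax
    return probs.mean(axis=0), probs.std(axis=0)   # predictive mean and spread

# Toy usage: 4 input features, 3 output classes, one random input vector.
W = rng.normal(0.0, 1.0, size=(4, 3))
x = rng.normal(0.0, 1.0, size=(1, 4))
mean, spread = mc_predict(x, W)
print("predictive mean:", mean)
print("per-class spread:", spread)
```

The multiplicative noise sampled in stochastic_synapse_forward is the mechanism the abstract credits with the automatic weight-normalization effect during learning, while the repeated sampling in mc_predict mirrors the inference-mode operation used to report data uncertainty (e.g., on original versus rotated samples).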
