Contrasting Advantages of Learning With Random Weights and Backpropagation in Non-Volatile Memory Neural Networks

Recently, a Cambrian explosion of novel non-volatile memory (NVM) devices known as memristive devices has inspired efforts to build hardware neural networks that learn like the brain. Early experimental prototypes built simple perceptrons from nanosynapses, and recently, fully-connected multi-layer perceptron (MLP) learning systems have been realized. However, while backpropagating learning systems pair well with high-precision computer memories and achieve state-of-the-art performance, this typically comes with a massive energy budget. For future Internet of Things/peripheral use cases, system energy footprint will be a major constraint, and emerging NVM devices may fill the gap by sacrificing high bit precision for lower energy. In this paper, we contrast the well-known MLP approach with the extreme learning machine (ELM) or NoProp approach, which uses a large layer of random weights to improve the separability of high-dimensional tasks and is usually considered inferior in a software context. However, we find that when device non-linearity is taken into account, NoProp manages to equal hardware MLP systems in terms of accuracy. Using a sign-based adaptation of the delta rule for energy savings, we also find that NoProp can learn effectively with four to six 'bits' of device analog capacity, while MLP requires eight-bit capacity with the same rule. This may allow the requirements for memristive devices to be relaxed in the context of online learning. By comparing the energy footprint of these systems for several candidate nanosynapses and realistic peripherals, we confirm that memristive NoProp systems save energy compared with MLP systems. Lastly, we show that ELM/NoProp systems can achieve better generalization abilities than nanosynaptic MLP systems when paired with pre-processing layers (which do not require backpropagated error). Collectively, these advantages make such systems worthy of consideration in future accelerators or embedded hardware.
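The ELM/NoProp idea summarized above (a fixed random hidden layer, with only the output weights trained by a sign-based delta rule on limited-precision synapses) can be illustrated with a minimal software sketch. This is not the paper's implementation: the function names `train_noprop_sign` and `predict`, the Gaussian random weights, the tanh hidden activation, and the linear mapping from device level to weight are all assumptions chosen for illustration, and no device non-linearity or energy model is included.

```python
import numpy as np

def train_noprop_sign(X, Y, hidden=64, bits=6, epochs=5, w_max=1.0, seed=0):
    """Train only the output layer of a random-hidden-layer network.

    X: (n_samples, n_features) inputs; Y: (n_samples, n_outputs) targets.
    Returns (W_in, W_out), with W_out restricted to 2**bits levels.
    All modelling choices here are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    # Fixed random projection: drawn once and never trained.
    W_in = rng.normal(0.0, 1.0, size=(X.shape[1], hidden))
    levels = 2 ** bits
    step = 2.0 * w_max / (levels - 1)  # weight change per device level
    # Integer level index per output synapse, started mid-range.
    idx = np.full((hidden, Y.shape[1]), levels // 2, dtype=np.int64)

    def to_weights(level_idx):
        # Assumed linear map from level index to a weight in [-w_max, w_max].
        return level_idx * step - w_max

    for _ in range(epochs):
        for i in rng.permutation(X.shape[0]):
            h = np.tanh(X[i] @ W_in)          # hidden activations
            err = Y[i] - h @ to_weights(idx)  # output error
            # Sign-based delta rule: each synapse moves one level up or
            # down, using only the sign of (hidden activation * error).
            delta = np.sign(np.outer(h, err)).astype(np.int64)
            idx = np.clip(idx + delta, 0, levels - 1)
    return W_in, to_weights(idx)

def predict(X, W_in, W_out):
    """Forward pass through the fixed random layer and trained output layer."""
    return np.tanh(X @ W_in) @ W_out
```

Because the update uses only the sign of the error term, every write is a single fixed-size step, and the `bits` parameter directly sets how many distinct weight values an output synapse can take in this sketch.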

[77]  Ojas Parekh,et al.  Energy Scaling Advantages of Resistive Memory Crossbar Based Computation and Its Application to Sparse Coding , 2016, Front. Neurosci..