Making Memristive Neural Network Accelerators Reliable

Deep neural networks (DNNs) have attracted substantial interest in recent years due to their superior performance on many classification and regression tasks compared to other supervised learning models. DNNs often require a large amount of data movement, resulting in performance and energy overheads. One promising way to address this problem is to design an accelerator based on in-situ analog computing that leverages the fundamental electrical properties of memristive circuits to perform matrix-vector multiplication. Recent work on analog neural network accelerators has shown great potential for improving both system performance and energy efficiency. However, detecting and correcting the errors that occur during in-memory analog computation remains largely unexplored. The same electrical properties that provide the performance and energy improvements make these systems especially susceptible to errors, which can severely degrade the accuracy of neural network accelerators. This paper proposes a new error correction scheme for analog neural network accelerators based on arithmetic codes. The scheme encodes the data through multiplication by an integer, which preserves addition operations through the distributive property. Error detection and correction are performed through a modulus operation and a correction table lookup. This basic scheme is further improved by data-aware encoding to exploit the state dependence of the errors, and by knowledge of how critical each portion of the computation is to overall system accuracy. By leveraging the observation that a physical row containing fewer 1s is less susceptible to an error, the proposed scheme increases the effective error correction capability with less than 4.5% area and less than 4.7% energy overheads. When applied to a memristive DNN accelerator performing inference on the MNIST and ILSVRC-2012 datasets, the proposed technique reduces the respective misclassification rates by 1.5x and 1.1x.
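The arithmetic-coding idea above can be illustrated with a minimal sketch of an AN code. The check constant A = 23 and the 11-bit codeword width are illustrative choices only (chosen because +2^i mod 23 and -2^i mod 23 are all distinct for i < 11, so every single-bit flip has a unique syndrome); they are not the parameters used in the paper.

```python
A = 23       # illustrative check constant; +/-2^i mod 23 are all distinct for i < 11
WIDTH = 11   # codeword width for which single-bit syndromes are unique

def encode(x):
    """Encode a value by multiplying by A; valid codewords are multiples of A."""
    return A * x

# Correction table: syndrome (codeword mod A) -> additive correction.
corr = {}
for i in range(WIDTH):
    corr[pow(2, i, A)] = -(1 << i)   # a 0->1 flip added 2^i, so subtract it
    corr[(-(1 << i)) % A] = 1 << i   # a 1->0 flip removed 2^i, so add it back

def check_and_correct(cw):
    """Detect an error via the modulus; correct single-bit flips via table lookup."""
    s = cw % A
    if s == 0:
        return cw                    # valid codeword: an exact multiple of A
    return cw + corr[s]              # unique syndrome identifies the flipped bit

# The distributive property preserves addition over encoded values:
#   encode(a) + encode(b) == A*a + A*b == A*(a + b) == encode(a + b)
assert encode(5) + encode(9) == encode(14)

# Inject a single bit flip into a codeword and recover the original.
cw = encode(42) ^ (1 << 3)           # flip bit 3 of the codeword
assert check_and_correct(cw) == encode(42)
assert check_and_correct(cw) // A == 42
```

Because encoded sums remain valid codewords, the accelerator's accumulated dot-product results can be checked with a single modulus at the output rather than per operand.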
