Design Guidelines of RRAM based Neural-Processing-Unit: A Joint Device-Circuit-Algorithm Analysis

RRAM based neural-processing-unit (NPU) is emerging for processing general purpose machine intelligence algorithms with ultra-high energy efficiency, while the imperfections of the analog devices and cross-point arrays make the practical application more complicated. In order to improve accuracy and robustness of the NPU, device-circuit-algorithm codesign with consideration of underlying device and array characteristics should outperform the optimization of individual device or algorithm. In this work, we provide a joint device-circuit-algorithm analysis and propose the corresponding design guidelines. Key innovations include: 1) An end-to-end simulator for RRAM NPU is developed with an integrated framework from device to algorithm. 2) The complete design of circuit and architecture for RRAM NPU is provided to make the analysis much close to the real prototype. 3) A large-scale neural network as well as other general-purpose networks are processed for the study of device-circuit interaction. 4) Accuracy loss from non-idealities of RRAM, such as I-V nonlinearity, noises of analog resistance levels, voltage-drop for interconnect, ADC/DAC precision, are evaluated for the NPU design.

[1]  Wenguang Chen,et al.  Bridge the Gap between Neural Networks and Neuromorphic Hardware with a Neural Network Compiler , 2017, ASPLOS.

[2]  Tao Zhang,et al.  PRIME: A Novel Processing-in-Memory Architecture for Neural Network Computation in ReRAM-Based Main Memory , 2016, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA).

[3]  Suman Datta,et al.  The era of hyper-scaling in electronics , 2018, Nature Electronics.

[4]  Tayfun Gokmen,et al.  The Next Generation of Deep Learning Hardware: Analog Computing , 2019, Proceedings of the IEEE.

[5]  Mark Horowitz,et al.  1.1 Computing's energy problem (and what we can do about it) , 2014, 2014 IEEE International Solid-State Circuits Conference Digest of Technical Papers (ISSCC).

[6]  Demis Hassabis,et al.  Mastering the game of Go without human knowledge , 2017, Nature.

[7]  Sally A. McKee,et al.  Hitting the memory wall: implications of the obvious , 1995, CARN.

[8]  Xiaochen Peng,et al.  NeuroSim: A Circuit-Level Macro Model for Benchmarking Neuro-Inspired Architectures in Online Learning , 2018, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.

[9]  Guigang Zhang,et al.  Deep Learning , 2016, Int. J. Semantic Comput..

[10]  Wei Lu,et al.  The future of electronics based on memristive systems , 2018, Nature Electronics.

[11]  Shane Legg,et al.  Human-level control through deep reinforcement learning , 2015, Nature.

[12]  Pritish Narayanan,et al.  Equivalent-accuracy accelerated neural-network training using analogue memory , 2018, Nature.

[13]  Miao Hu,et al.  ISAAC: A Convolutional Neural Network Accelerator with In-Situ Analog Arithmetic in Crossbars , 2016, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA).

[14]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[15]  H.-S. Philip Wong,et al.  Face classification using electronic synapses , 2017, Nature Communications.

[16]  Mohammed A. Zidan,et al.  Parasitic Effect Analysis in Memristor-Array-Based Neuromorphic Systems , 2018, IEEE Transactions on Nanotechnology.

[17]  Yu Wang,et al.  MNSIM: Simulation Platform for Memristor-Based Neuromorphic Computing System , 2016, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.

[18]  H.-S. Philip Wong,et al.  In-memory computing with resistive switching devices , 2018, Nature Electronics.

[19]  H.-S. Philip Wong,et al.  Device and circuit optimization of RRAM for neuromorphic computing , 2017, 2017 IEEE International Electron Devices Meeting (IEDM).

[20]  Qing Wu,et al.  Efficient and self-adaptive in-situ learning in multilayer memristor neural networks , 2018, Nature Communications.