IMC Architecture for Robust DNN Acceleration

RRAM-based in-memory computing (IMC) effectively accelerates deep neural networks (DNNs) and other machine learning algorithms. However, in the presence of RRAM device variations and low precision, mapping DNNs onto RRAM-based IMC suffers severe accuracy loss. In this work, we propose a novel hybrid IMC architecture that integrates an RRAM-based IMC macro with a digital SRAM macro through a programmable shifter to compensate for the RRAM variations and recover the accuracy. The digital SRAM macro consists of a small SRAM memory array and an array of multiply-and-accumulate (MAC) units. The non-ideal output of the RRAM macro, caused by device and circuit non-idealities, is compensated by adding the precise output of the SRAM macro. In addition, the programmable shifter allows for different scales of compensation by shifting the SRAM macro output relative to the RRAM macro output. We design a silicon prototype of the proposed hybrid IMC architecture in the 65 nm SUNY process to demonstrate its efficacy. Experimental evaluation of the hybrid IMC architecture shows up to 21.9% and 6.5% improvement in post-mapping accuracy over state-of-the-art techniques, at minimal overhead, for the CIFAR-10 and ImageNet datasets, respectively.
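The compensation idea can be sketched in a few lines of Python. This is a behavioral model only, not the paper's circuit: the noise model (multiplicative Gaussian conductance variation), the weight split between the two macros, and all function names (`rram_mac`, `hybrid_mac`) are illustrative assumptions. The key point it shows is that the SRAM macro's exact partial sum, left-shifted by a programmable amount, is added to the noisy RRAM partial sum.

```python
import numpy as np

rng = np.random.default_rng(0)

def rram_mac(x, w, sigma=0.1):
    # Analog MAC with multiplicative Gaussian noise on each weight,
    # an illustrative model of RRAM conductance variation.
    w_noisy = w * (1.0 + rng.normal(0.0, sigma, size=w.shape))
    return x @ w_noisy

def hybrid_mac(x, w_rram, w_sram, shift, sigma=0.1):
    # Noisy RRAM partial sum, compensated by the exact digital SRAM
    # partial sum scaled by the programmable shifter (factor 2**shift).
    return rram_mac(x, w_rram, sigma) + (x @ w_sram) * (1 << shift)

# Hypothetical weight split: the high-order part of each weight goes
# to the SRAM macro, the remainder to the RRAM macro (shift = 2 -> x4).
shift = 2
w = rng.integers(-8, 8, size=(16, 4)).astype(float)
w_sram = np.round(w / (1 << shift))
w_rram = w - w_sram * (1 << shift)
x = rng.integers(0, 4, size=(1, 16)).astype(float)

ideal = x @ w
hybrid = hybrid_mac(x, w_rram, w_sram, shift)
```

With `sigma = 0` the hybrid output reconstructs the ideal MAC exactly, since the two partial sums recombine the split weights; with noise, the error is confined to the small residual weights held in RRAM.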
