A hybrid precision low power computing-in-memory architecture for neural networks

Abstract Recently, non-volatile memory-based computing-in-memory has been regarded as a promising candidate for ultra-low-power AI chips. Implementations based on both binarized (BIN) and multi-bit (MB) schemes have been proposed for DNNs/CNNs. However, both schemes face challenges in accuracy and power efficiency in practical use. This paper proposes a hybrid precision architecture and circuit-level techniques to overcome these challenges. Measured results show that a test chip based on the proposed architecture achieves (1) configurable precision ranging from binarized weights and inputs up to 8-bit inputs, 5-bit weights, and 7-bit outputs, (2) a reduction of accuracy loss by 86% to 96% across multiple complex CNNs, and (3) a power efficiency of 2.15 TOPS/W in a 0.22 μm CMOS process, which greatly reduces cost compared with digital designs of similar power efficiency. With a more advanced process, the architecture can achieve higher power efficiency; according to our estimation, over 20 TOPS/W is attainable in a 55 nm CMOS process.
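The precision limits reported for the test chip (8-bit inputs, 5-bit weights, 7-bit outputs) can be illustrated with a software model of a quantized multiply-and-accumulate operation. The sketch below is purely illustrative and is not the chip's actual circuit behavior: the quantization scheme (uniform symmetric), the wide intermediate accumulator, and the truncate-and-clip step that fits the result into a 7-bit output word are all assumptions chosen to make the bit-width trade-off concrete.

```python
import numpy as np

def quantize(x, bits):
    """Uniform symmetric quantization to signed integers (illustrative)."""
    qmax = 2 ** (bits - 1) - 1
    scale = max(float(np.max(np.abs(x))), 1e-12) / qmax
    q = np.clip(np.round(x / scale), -qmax, qmax).astype(np.int64)
    return q, scale

def hybrid_mac(x, w, in_bits=8, w_bits=5, out_bits=7):
    """MAC with quantized operands and a clipped output word.

    Operands are quantized to the stated bit widths, accumulated at
    full precision, then truncated so the result fits in `out_bits`,
    mimicking a fixed-width output interface.
    """
    qx, sx = quantize(x, in_bits)
    qw, sw = quantize(w, w_bits)
    acc = int(np.dot(qx, qw))                        # wide accumulator
    qmax = 2 ** (out_bits - 1) - 1
    shift = max(acc.bit_length() - out_bits + 1, 0)  # fit into out_bits
    out = int(np.clip(acc >> shift, -qmax, qmax))    # truncated output word
    return out * (1 << shift) * sx * sw              # dequantized estimate
```

For example, `hybrid_mac(np.array([0.5, -0.25, 1.0]), np.array([0.1, 0.2, -0.3]))` returns a value close to the exact dot product (-0.3); the residual error is what the paper's accuracy-loss-reduction techniques would target in hardware.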
