High Area/Energy Efficiency RRAM CNN Accelerator with Kernel-Reordering Weight Mapping Scheme Based on Pattern Pruning

Resistive Random Access Memory (RRAM) is an emerging device for processing-in-memory (PIM) architectures that accelerate convolutional neural networks (CNNs). However, because the crossbar structure of an RRAM array tightly couples its rows and columns, it is difficult to exploit network sparsity in RRAM-based CNN accelerators. To optimize the mapping of sparse network weights onto the RRAM array and achieve high area and energy efficiency, we propose a novel weight mapping scheme and a corresponding RRAM-based CNN accelerator architecture based on pattern pruning and the Operation Unit (OU) mechanism. Experimental results show that our work achieves 4.16x-5.20x crossbar area efficiency, 1.98x-2.15x energy efficiency, and 1.15x-1.35x performance speedup compared with the traditional weight mapping method.
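
The core idea can be illustrated in a few lines: under pattern pruning, every kernel keeps its nonzeros at one of a few fixed positions, so kernels that share a pattern can be reordered onto adjacent crossbar rows and only their nonzero columns mapped, yielding dense OU-sized blocks. The sketch below is illustrative only and assumes a simple 3x3-kernel setting; the helper names (`pattern_of`, `group_kernels_by_pattern`, `map_to_ou_blocks`) and the OU size are our own choices, not the paper's implementation.

```python
import numpy as np

def pattern_of(kernel, tol=0.0):
    """Return the nonzero-position mask of a kernel as a hashable tuple."""
    return tuple((np.abs(kernel.flatten()) > tol).astype(int))

def group_kernels_by_pattern(weights):
    """Group kernel indices by shared sparsity pattern.

    weights: array of shape (out_ch, in_ch, k, k), pattern-pruned.
    Returns {pattern: [(out_ch_idx, in_ch_idx), ...]}.
    """
    groups = {}
    for o in range(weights.shape[0]):
        for i in range(weights.shape[1]):
            groups.setdefault(pattern_of(weights[o, i]), []).append((o, i))
    return groups

def map_to_ou_blocks(weights, ou_rows=8):
    """Reorder kernels so same-pattern kernels occupy consecutive crossbar
    rows; only the nonzero positions of each pattern are stored, so every
    Operation Unit (OU) block of `ou_rows` rows is dense."""
    blocks = []
    for pattern, kernels in group_kernels_by_pattern(weights).items():
        cols = [j for j, bit in enumerate(pattern) if bit]  # kept positions
        for start in range(0, len(kernels), ou_rows):
            chunk = kernels[start:start + ou_rows]
            # each row holds only the nonzero weights of one kernel
            block = np.stack([weights[o, i].flatten()[cols] for o, i in chunk])
            blocks.append((pattern, chunk, block))
    return blocks

# Toy usage: 16 output channels, 4 input channels, 3x3 kernels, each kernel
# randomly pruned to one of two fixed 4-entry patterns.
rng = np.random.default_rng(0)
masks = [np.array([1, 1, 0, 1, 0, 0, 1, 0, 0]),
         np.array([0, 0, 1, 0, 1, 1, 0, 0, 1])]
w = rng.standard_normal((16, 4, 3, 3))
for o in range(16):
    for i in range(4):
        w[o, i] *= masks[rng.integers(2)].reshape(3, 3)

blocks = map_to_ou_blocks(w, ou_rows=8)
print(f"{len(blocks)} dense OU blocks, shapes:", {b[2].shape for b in blocks})
```

Because each block stores only the shared nonzero columns, a 3x3 kernel pruned to 4 entries occupies 4 crossbar columns instead of 9, which is the source of the reported crossbar area savings; the grouping step is what keeps OU rows pattern-aligned so no zero weights need to be programmed.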
