Layerwise Buffer Voltage Scaling for Energy-Efficient Convolutional Neural Network

Buffer accesses account for a significant portion of the total energy consumption of a convolutional neural network (CNN), and the ratio of buffer energy to total energy can differ substantially across the layers of a CNN. Applying a different degree of energy conservation to each layer is therefore an effective way to reduce buffer energy consumption. This article proposes layerwise buffer voltage scaling as a technique for reducing buffer access energy. An error-resilience analysis that includes interlayer effects, conducted at design time, determines the buffer supply voltage for each layer of a CNN; these layer-specific buffer supply voltages are then used during image classification inference. Error-injection experiments with three different CNN architectures show that this technique reduces buffer access energy by up to 68.41% and overall system energy by up to 33.68% without sacrificing image classification accuracy.
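The following Python sketch illustrates one way the design-time, per-layer voltage selection could be organized, assuming a PyTorch model with 8-bit activation buffers. The voltage-to-bit-error-rate table (VOLTAGE_ERROR_RATES), the toy quantization in flip_bits, and the greedy search in select_layer_voltages are all illustrative assumptions, not the paper's actual implementation; in practice the voltage-to-error-rate mapping would come from SRAM characterization.

    # A minimal sketch of design-time, layerwise buffer voltage selection,
    # assuming a PyTorch model.  The voltage -> bit-error-rate (BER) table
    # below is a hypothetical stand-in for real SRAM characterization data.
    import torch

    VOLTAGE_ERROR_RATES = {0.9: 0.0, 0.8: 1e-6, 0.7: 1e-4, 0.6: 1e-3}  # V -> BER (assumed)

    def flip_bits(t, ber):
        # Inject random bit flips at rate `ber` into a toy 8-bit view of t.
        if ber == 0.0:
            return t
        q = (t.clamp(-1, 1) * 127).to(torch.int8)            # toy 8-bit quantization
        flips = torch.rand(q.shape + (8,), device=t.device) < ber
        weights = 2 ** torch.arange(8, device=t.device)      # bit-position weights
        xor_mask = (flips.long() * weights).sum(-1).to(torch.int8)
        return (q ^ xor_mask).float() / 127                  # flip bits, dequantize

    def select_layer_voltages(model, layers, eval_fn, acc_target):
        # Greedily pick, per layer, the lowest buffer voltage whose injected
        # errors keep validation accuracy (eval_fn) at or above acc_target.
        voltages, kept_hooks = {}, []
        for name, layer in layers:
            chosen = max(VOLTAGE_ERROR_RATES)                # fallback: nominal voltage
            for v in sorted(VOLTAGE_ERROR_RATES):            # most aggressive voltage first
                ber = VOLTAGE_ERROR_RATES[v]
                h = layer.register_forward_hook(
                    lambda m, i, out, ber=ber: flip_bits(out, ber))
                acc = eval_fn(model)
                h.remove()
                if acc >= acc_target:
                    chosen = v
                    break
            voltages[name] = chosen
            # Keep this layer's chosen injection active so that later layers
            # are evaluated under accumulated, interlayer error effects.
            ber = VOLTAGE_ERROR_RATES[chosen]
            kept_hooks.append(layer.register_forward_hook(
                lambda m, i, out, ber=ber: flip_bits(out, ber)))
        for h in kept_hooks:
            h.remove()
        return voltages

For a model net, one might call select_layer_voltages(net, [(n, m) for n, m in net.named_modules() if isinstance(m, torch.nn.Conv2d)], eval_fn, acc_target) with a user-supplied validation routine eval_fn. Leaving each layer's chosen injection active while later layers are searched is what lets the selection reflect the interlayer effects mentioned above.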
