SRAM voltage scaling for energy-efficient convolutional neural networks

State-of-the-art convolutional neural networks (ConvNets) now achieve near-human performance on a wide range of classification tasks. Unfortunately, current hardware implementations of ConvNets are memory-power intensive, prohibiting deployment in low-power embedded systems and IoE platforms. One method of reducing memory power is to exploit the error resilience of ConvNets and accept bit errors under reduced supply voltages. In this paper, we extensively study the effectiveness of this idea and show that further savings are possible by injecting bit errors during ConvNet training. Measurements on an 8KB SRAM in 28nm UTBB FD-SOI CMOS demonstrate a supply voltage reduction of 310mV, which results in up to 5.4× leakage power reduction and up to 2.9× memory access power reduction at 99% of floating-point classification accuracy, with no additional hardware cost. To our knowledge, this is the first silicon-validated study of the effect of bit errors in ConvNets.

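The abstract gives no reference implementation, so purely as an illustration of the training-time error-injection idea, the sketch below flips bits of fixed-point weight words at a given bit-error rate, emulating reads from an SRAM operated below its nominal supply voltage. The 8-bit two's-complement quantization, the independent uniform bit-flip error model, and the name `inject_bit_errors` are assumptions made for this sketch, not details taken from the paper.

```python
import numpy as np

def inject_bit_errors(weights, p_err, n_bits=8, rng=None):
    """Flip each stored bit of fixed-point weights independently with
    probability p_err, emulating SRAM read failures at a scaled supply
    voltage. Illustrative only: bit width, quantization scheme, and
    error model are assumptions, not taken from the paper."""
    rng = np.random.default_rng() if rng is None else rng
    scale = (1 << (n_bits - 1)) - 1
    # Quantize to signed n-bit two's-complement integers.
    q = np.clip(np.round(weights * scale), -scale - 1, scale).astype(np.int64)
    # Reinterpret each word as its unsigned n-bit pattern.
    u = (q & ((1 << n_bits) - 1)).ravel()
    # Draw an independent flip decision for every bit of every word,
    # then pack the flips into one XOR mask per word.
    flips = rng.random((u.size, n_bits)) < p_err
    mask = (flips.astype(np.int64) * (1 << np.arange(n_bits))).sum(axis=1)
    u = (u ^ mask).reshape(weights.shape)
    # Sign-extend back to signed integers, then dequantize.
    q = np.where(u >= (1 << (n_bits - 1)), u - (1 << n_bits), u)
    return q.astype(np.float32) / scale

# Example: corrupt a conv layer's weights at a 0.1% bit-error rate.
w = np.random.uniform(-1, 1, size=(64, 3, 3, 3)).astype(np.float32)
w_noisy = inject_bit_errors(w, p_err=1e-3)
```

In training, such a routine would be applied to the stored weights on each forward pass, so that backpropagation drives the network toward parameters that tolerate the bit-error rates observed at the reduced supply voltage.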