Energy-efficient ConvNets through approximate computing

Recently, convolutional neural networks (ConvNets) have emerged as state-of-the-art classification and detection algorithms, achieving near-human performance in visual detection tasks. However, ConvNet algorithms are typically very computation- and memory-intensive. To embed ConvNet-based classification into wearable platforms and embedded systems such as smartphones or ubiquitous electronics for the Internet of Things, their energy consumption must be reduced drastically. This paper proposes methods based on approximate computing to reduce the energy consumption of state-of-the-art ConvNet accelerators. By combining techniques at both the system and circuit level, the energy consumed by the system's arithmetic can be reduced by up to 30× without losing classification accuracy, and by more than 100× at 99% classification accuracy, compared to the commonly used 16-bit fixed-point number format.
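As a rough illustration of the precision/energy trade-off the abstract alludes to, the sketch below shows how weights can be rounded to progressively narrower fixed-point formats, the kind of word-length scaling that underlies such approximate arithmetic. This is a hypothetical example, not the paper's method; the function name `quantize`, the bit-width choices, and the synthetic weight distribution are all our assumptions.

```python
import numpy as np

def quantize(x, word_bits=16, frac_bits=8):
    """Round x to signed fixed-point with `word_bits` total bits,
    `frac_bits` of which are fractional, saturating on overflow.
    (Hypothetical helper; not from the paper.)"""
    scale = 2.0 ** frac_bits
    lo = -(2 ** (word_bits - 1))          # most negative integer code
    hi = 2 ** (word_bits - 1) - 1         # most positive integer code
    q = np.clip(np.round(x * scale), lo, hi)
    return q / scale                      # back to real-valued domain

# Sweep the word length of some synthetic "ConvNet weights" and report
# the worst-case quantization error, mimicking the precision/accuracy
# trade-off that approximate ConvNet arithmetic exploits.
rng = np.random.default_rng(0)
weights = rng.normal(0.0, 0.1, size=10_000)
for w in (16, 12, 8, 6, 4):
    err = np.abs(weights - quantize(weights, word_bits=w, frac_bits=w - 2)).max()
    print(f"{w:2d}-bit fixed point: max quantization error {err:.5f}")
```

In practice, one would re-evaluate the network's classification accuracy at each word length and keep the narrowest format that still meets the accuracy target, since shorter words directly shrink multiplier energy.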
