Low-power implementation of Mitchell's approximate logarithmic multiplication for convolutional neural networks

This paper proposes a low-power implementation of the approximate logarithmic multiplier to reduce the power consumption of convolutional neural networks for image classification, taking advantage of their intrinsic tolerance to error. The approximate logarithmic multiplier converts multiplications to additions by taking approximate logarithms, and it achieves significant improvements in power and area while keeping the worst-case error low, which makes it suitable for neural network computation. The proposed design shows a significant improvement in power and area over previous work that applied logarithmic multiplication to neural networks, reducing power by up to 76.6% compared to exact fixed-point multiplication while maintaining comparable prediction accuracy on convolutional neural networks for the MNIST and CIFAR-10 datasets.
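
For reference, the sketch below illustrates the core idea of Mitchell's binary logarithmic multiplication that the abstract refers to: each operand n = 2^k * (1 + x), with 0 <= x < 1, is approximated as log2(n) ~ k + x, the two approximate logarithms are added, and an approximate antilogarithm yields the product. This is a minimal Python model using floating-point fractions for clarity only; the hardware design in the paper operates on fixed-point operands with shift-and-add logic, and the function name and structure here are illustrative assumptions, not taken from the paper.

```python
def mitchell_multiply(a: int, b: int) -> float:
    """Approximate a * b using Mitchell's logarithmic multiplication.

    Each operand n = 2**k * (1 + x), 0 <= x < 1, is approximated by
    log2(n) ~= k + x; the logarithms are added and an approximate
    antilogarithm is taken. The result never exceeds the exact product,
    and the relative error is bounded (worst case roughly 11.1%).
    """
    if a == 0 or b == 0:
        return 0.0
    ka, kb = a.bit_length() - 1, b.bit_length() - 1  # characteristics (leading-one positions)
    xa = (a - (1 << ka)) / (1 << ka)                 # fractional mantissas in [0, 1)
    xb = (b - (1 << kb)) / (1 << kb)
    x = xa + xb                                      # sum of the fractional parts
    if x < 1:                                        # no carry out of the fraction adder
        return (1 << (ka + kb)) * (1 + x)
    return (1 << (ka + kb + 1)) * x                  # carry: increment the exponent, reuse x


# Example: 3 * 3 is approximated as 8 (the ~11.1% worst case),
# while 5 * 6 is approximated as 28 instead of 30.
if __name__ == "__main__":
    print(mitchell_multiply(3, 3), mitchell_multiply(5, 6))
```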
