Paper information - Efficient Mitchell’s Approximate Log Multipliers for Convolutional Neural Networks

This paper proposes energy-efficient approximate multipliers based on Mitchell’s log multiplication, optimized for inference on convolutional neural networks (CNNs). Several design techniques are applied to the log multiplier, including a fully parallel leading-one detector (LOD), efficient shift-amount calculation, and exact zero computation. Truncation of the operands is also studied, yielding a customizable log multiplier that further reduces energy consumption. The paper also proposes handling negative numbers with one’s complement, as an approximation of the two’s complement used in prior work. The viability of the proposed designs is supported by detailed formal analysis as well as experimental results on CNNs. The experiments also provide insight into the effect of approximate multiplication in CNNs, identifying the importance of minimizing the range of error. The proposed customizable design at $w = 8$ saves up to 88 percent energy compared to the exact 32-bit fixed-point multiplier, with a performance degradation of just 0.2 percent on the ImageNet ILSVRC2012 dataset.
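For readers unfamiliar with the underlying technique, the following is a minimal software sketch of classic Mitchell log multiplication for unsigned integers, with the exact-zero handling mentioned in the abstract. It is not the paper's hardware design: the function name `mitchell_mul` is hypothetical, the leading-one detector is modeled with Python's `int.bit_length`, and operand truncation and one's-complement sign handling are left out.

```python
def mitchell_mul(a: int, b: int) -> int:
    """Approximate a * b for non-negative integers using Mitchell's method.

    Each operand is written as 2**k * (1 + x) with 0 <= x < 1, and
    log2(operand) is approximated by k + x. The two approximate logs are
    added, and the antilog is approximated the same way.
    """
    if a < 0 or b < 0:
        raise ValueError("this sketch handles non-negative operands only")

    # Exact zero: the log of zero is undefined, so zero is handled separately.
    if a == 0 or b == 0:
        return 0

    # Leading-one detection: position of the most significant set bit.
    k1 = a.bit_length() - 1
    k2 = b.bit_length() - 1

    # Fractional parts x1 and x2, kept as integers scaled by 2**k1 and 2**k2.
    f1 = a - (1 << k1)
    f2 = b - (1 << k2)

    # (x1 + x2) scaled by 2**(k1 + k2), so everything stays in integers.
    frac_sum = (f1 << k2) + (f2 << k1)
    one = 1 << (k1 + k2)

    if frac_sum < one:
        # x1 + x2 < 1: result is 2**(k1 + k2) * (1 + x1 + x2).
        return one + frac_sum
    # x1 + x2 >= 1: carry into the characteristic,
    # result is 2**(k1 + k2 + 1) * (x1 + x2).
    return 2 * frac_sum


if __name__ == "__main__":
    for a, b in [(4, 4), (5, 6), (3, 3)]:
        print(a, b, "exact:", a * b, "approx:", mitchell_mul(a, b))
```

In this sketch the approximation never exceeds the exact product, and the relative error is largest (one ninth, about 11.1 percent) when both fractional parts equal 0.5, as in 3 × 3.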
