Improving the Accuracy and Hardware Efficiency of Neural Networks Using Approximate Multipliers

Improving the accuracy of a neural network (NN) usually requires using larger hardware that consumes more energy. However, the error tolerance of NNs and their applications allows approximate computing techniques to be applied to reduce implementation costs. Given that multiplication is the most resource-intensive and power-hungry operation in NNs, more economical approximate multipliers (AMs) can significantly reduce hardware costs. In this article, we show that using AMs can also improve NN accuracy by introducing noise. We consider two categories of AMs: 1) deliberately designed and 2) Cartesian genetic programming (CGP)-based AMs. The exact multipliers in two representative NNs, a multilayer perceptron (MLP) and a convolutional NN (CNN), are replaced with approximate designs to evaluate their effect on the classification accuracy of the Modified National Institute of Standards and Technology (MNIST) and Street View House Numbers (SVHN) data sets, respectively. Interestingly, up to 0.63% improvement in the classification accuracy is achieved together with reductions of 71.45% and 61.55% in the energy consumption and area, respectively. Finally, the features of an AM that tend to make one design outperform others with respect to NN accuracy are identified. Those features are then used to train a predictor that indicates how well an AM is likely to work in an NN.
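
To make the substitution concrete, the sketch below models an 8-bit approximate multiplier as a behavioural lookup table and routes a quantized layer's products through it. The truncation-based multiplier, the function and variable names, and the 8-bit quantization are illustrative assumptions only, not the specific deliberately designed or CGP-based AMs evaluated in the article.

```python
import numpy as np

def truncated_mult(a, b, drop_bits=4):
    # Toy stand-in for a hardware AM (an assumption, not the article's designs):
    # clear the low-order operand bits so low-significance partial products vanish.
    mask = 0xFF & ~((1 << drop_bits) - 1)
    return (a & mask) * (b & mask)

# Behavioural model of the AM as a 256x256 lookup table over all unsigned
# 8-bit operand pairs, the form in which a synthesized AM is often simulated.
LUT = np.array([[truncated_mult(a, b) for b in range(256)] for a in range(256)],
               dtype=np.int64)

def approx_matmul(x_q, w_q):
    # Matrix product in which every scalar multiplication goes through the AM
    # lookup table; x_q and w_q hold uint8-quantized activations and weights.
    out = np.zeros((x_q.shape[0], w_q.shape[1]), dtype=np.int64)
    for i in range(x_q.shape[0]):
        for j in range(w_q.shape[1]):
            out[i, j] = LUT[x_q[i, :], w_q[:, j]].sum()
    return out

# Example: compare exact and approximate products for one random quantized layer.
rng = np.random.default_rng(0)
x = rng.integers(0, 256, size=(4, 16), dtype=np.uint8)
w = rng.integers(0, 256, size=(16, 8), dtype=np.uint8)
exact = x.astype(np.int64) @ w.astype(np.int64)
approx = approx_matmul(x, w)
print("mean relative error:", np.mean(np.abs(exact - approx) / np.maximum(exact, 1)))
```

The predictor mentioned at the end of the abstract could, under similar assumptions, be fit as a regression from an AM's error features to the measured change in NN accuracy. The feature names, the toy values, and the choice of scikit-learn's ExtraTreesRegressor below are purely illustrative.

```python
# Minimal predictor sketch: map hand-picked AM error features (hypothetical
# names, toy values) to the observed change in classification accuracy (%).
from sklearn.ensemble import ExtraTreesRegressor

am_features = [[0.12, 0.031], [0.45, 0.110], [0.08, 0.007]]  # e.g. [error rate, mean relative error]
accuracy_delta = [0.4, -1.2, 0.6]                            # toy accuracy changes, for illustration

predictor = ExtraTreesRegressor(n_estimators=100, random_state=0).fit(am_features, accuracy_delta)
print(predictor.predict([[0.10, 0.020]]))                    # estimate for an unseen AM design
```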
