CAxCNN: Towards the Use of Canonic Sign Digit Based Approximation for Hardware-Friendly Convolutional Neural Networks

Hardware-friendly architectures with low computational overhead are desirable for low-latency realization of CNNs on resource-constrained embedded platforms. In this work, we propose CAxCNN, a Canonic Sign Digit (CSD) based approximation methodology for representing the filter weights of pre-trained CNNs. The proposed CSD representation allows the use of multipliers with reduced computational complexity. The technique is complementary to state-of-the-art CNN quantization schemes and can be applied on top of them. Our experimental results on a variety of CNNs, trained on the MNIST, CIFAR-10 and ImageNet datasets, demonstrate that our methodology provides CNN designs with multiple levels of classification accuracy, without requiring any retraining, while having low area and computational overhead. Furthermore, when applied in conjunction with a state-of-the-art quantization scheme, CAxCNN allows the use of multipliers that offer a 77% logic area reduction compared to their accurate counterparts, while incurring a drop in Top-1 accuracy of just 5.63% for a VGG-16 network trained on ImageNet.
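To make the idea of a CSD-based weight approximation concrete, the following is a minimal sketch, not the CAxCNN implementation. It recodes an integer (for example, a fixed-point weight) into canonic signed digits using the standard non-adjacent-form recoding, then approximates it by keeping only the `k` most significant nonzero digits. The function names (`to_csd`, `from_csd`, `approx_csd`) and the truncation rule are assumptions made for illustration; the abstract does not specify how the approximation levels are chosen.

```python
def to_csd(n: int) -> list[int]:
    """Return CSD digits of n in {-1, 0, +1}, least significant digit first.

    Standard non-adjacent-form recoding: no two neighbouring digits
    are both nonzero.
    """
    digits = []
    while n != 0:
        if n & 1:
            # n % 4 == 1 -> digit +1, n % 4 == 3 -> digit -1
            d = 2 - (n % 4)
            n -= d
        else:
            d = 0
        digits.append(d)
        n //= 2
    return digits


def from_csd(digits: list[int]) -> int:
    """Evaluate a CSD digit list (least significant digit first)."""
    return sum(d * (1 << i) for i, d in enumerate(digits))


def approx_csd(n: int, k: int) -> int:
    """Approximate n by its k most significant nonzero CSD digits.

    Hypothetical truncation rule, assumed here for illustration only.
    """
    digits = to_csd(n)
    kept = 0
    # Walk from the most significant digit down, zeroing extra nonzeros.
    for i in range(len(digits) - 1, -1, -1):
        if digits[i] != 0:
            if kept < k:
                kept += 1
            else:
                digits[i] = 0
    return from_csd(digits)


if __name__ == "__main__":
    w = 23  # binary 10111 has four nonzero bits
    print(to_csd(w))         # digits, least significant first
    print(approx_csd(w, 2))  # value built from two signed powers of two
```

Each nonzero CSD digit corresponds to one shifted add or subtract in a shift-and-add multiplier, so under this sketch a smaller `k` means fewer partial products at the cost of a larger weight error.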
