Normalization and dropout for stochastic computing-based deep convolutional neural networks

Abstract: Deep Convolutional Neural Networks (DCNNs) have recently been recognized as the most effective model for pattern recognition and classification tasks. With the rapid growth of the Internet of Things (IoT) and wearable devices, it has become attractive to implement DCNNs in embedded and portable systems. However, DCNNs have high power consumption and complex topologies, so novel computing paradigms are urgently needed to deploy them in systems with limited area and power budgets. Recent work has demonstrated that Stochastic Computing (SC) can radically simplify the hardware implementation of arithmetic units and has the potential to bring the success of DCNNs to embedded systems. This paper introduces normalization and dropout, two techniques essential to state-of-the-art DCNNs, into existing SC-based DCNN frameworks. The feature extraction block of the DCNN is implemented using an approximate parallel counter, a near-max pooling block, and an SC-based rectified linear activation unit. A novel SC-based normalization design is proposed, consisting of a square-and-summation unit, an activation unit, and a division unit. Dropout is integrated into the training phase, and the learned weights are adjusted accordingly for the hardware implementation. Experimental results on AlexNet with the ImageNet dataset show that the SC-based DCNN with the proposed normalization and dropout techniques achieves a 3.26% top-1 and a 3.05% top-5 accuracy improvement over the SC-based DCNN without these two techniques, confirming the effectiveness of the proposed normalization and dropout designs.
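For readers unfamiliar with stochastic computing, the Python sketch below illustrates the mechanisms the abstract names. It is a minimal software model under stated assumptions, not the paper's hardware: bipolar bit-stream encoding and XNOR multiplication are standard SC constructions, the parallel counter is modeled as an exact popcount rather than the approximate counter used in the paper, the `lrn` function uses AlexNet's published local-response-normalization hyperparameters, and the stream length, retain probability `p`, and weight array are illustrative values.

```python
import numpy as np

rng = np.random.default_rng(0)

def to_stream(x, length):
    """Encode x in [-1, 1] as a bipolar bit-stream with P(1) = (x + 1) / 2."""
    return (rng.random(length) < (x + 1) / 2).astype(np.uint8)

def from_stream(bits):
    """Decode a bipolar bit-stream back to a value in [-1, 1]."""
    return 2.0 * bits.mean() - 1.0

def sc_inner_product(xs, ws, length=8192):
    """Inner product: one XNOR gate per (input, weight) pair, then a counter.

    The paper's hardware uses an *approximate* parallel counter; this model
    counts exactly, so it illustrates the dataflow, not the hardware error."""
    products = [np.logical_not(np.logical_xor(to_stream(x, length),
                                              to_stream(w, length)))
                for x, w in zip(xs, ws)]  # XNOR = bipolar multiply
    counts = np.sum(products, axis=0)     # per-cycle popcount
    return 2.0 * counts.mean() - len(xs)  # decode sum of bipolar products

def lrn(a, k=2.0, alpha=1e-4, beta=0.75, n=5):
    """AlexNet-style local response normalization across channels.

    Mirrors the three units named in the abstract: square-and-summation,
    activation ((k + alpha * s) ** beta), and division.
    a: activations of shape (channels, height, width)."""
    channels = a.shape[0]
    out = np.empty_like(a)
    for i in range(channels):
        lo, hi = max(0, i - n // 2), min(channels, i + n // 2 + 1)
        s = np.sum(a[lo:hi] ** 2, axis=0)        # square-and-summation unit
        out[i] = a[i] / (k + alpha * s) ** beta  # activation + division units
    return out

# Dropout at inference: no random masking in the deployed hardware; instead,
# the learned weights are scaled by the retain probability p
# (Srivastava et al., 2014). Values below are illustrative only.
p = 0.5
w_trained = rng.uniform(-1, 1, size=16)  # hypothetical learned weights
w_deployed = p * w_trained               # weights used by the SC hardware

# ~ 0.5*0.6 - 0.4*0.2 + 0.8*(-0.3) = -0.02, up to stochastic noise
print(sc_inner_product([0.5, -0.4, 0.8], [0.6, 0.2, -0.3]))
print(lrn(rng.uniform(0, 1, size=(8, 4, 4))).shape)  # (8, 4, 4)
```

Lengthening the bit-streams reduces the stochastic estimation error at the cost of latency, which is the central accuracy-versus-energy trade-off in SC-based DCNNs.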
