SBNN: Slimming binarized neural network

Abstract With the rapid development of applications built on deep neural networks, approaches for accelerating computationally intensive convolutional neural networks, such as network quantization, pruning, and knowledge distillation, have attracted ever-increasing attention. Network binarization is an extreme form of network quantization that binarizes the network weights and/or activation values to save computational resources. However, it often introduces noise into the network and requires a larger model size (more parameters) to compensate for the loss of representation capacity. To address the challenge of reducing model complexity while further improving network performance, this paper proposes slimming binarized neural networks (SBNN), an approach that reduces the complexity of binarized networks with acceptable accuracy loss. SBNN prunes the convolutional layers and the fully-connected layer of a binarized network; the pruned network is then refined with the proposed SoftSign function, knowledge distillation, and full-precision computation to recover accuracy. SBNN can also be conveniently applied to a pre-trained binarized network. We demonstrate the effectiveness of our approach on several state-of-the-art binarized models. For AlexNet and ResNet-18 on the ILSVRC-2012 dataset, SBNN incurs negligible accuracy loss, and even achieves better accuracy than the unpruned model, while using only 75% of the original filters.
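The core mechanism described above, binarizing weights with a sign function while keeping full-precision latent weights for optimizer updates and smoothing the gradient with a SoftSign-style surrogate, can be sketched in a few lines of PyTorch. This is a minimal illustration under stated assumptions, not SBNN's implementation: the exact SoftSign surrogate, the pruning criterion, and the distillation setup are not given in the abstract, so the 1/(1+|x|)^2 backward approximation and the BinaryConv2d wrapper below are hypothetical.

import torch
import torch.nn as nn


class SoftSignBinarize(torch.autograd.Function):
    # Forward: hard binarization with sign().
    # Backward: gradient of a SoftSign-like surrogate, 1 / (1 + |x|)^2,
    # used here in place of the usual hard straight-through estimator
    # (assumed form; the paper's exact function may differ).
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return torch.sign(x)

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        return grad_output / (1.0 + x.abs()).pow(2)


class BinaryConv2d(nn.Conv2d):
    # Convolution whose weights are binarized on the fly; the full-precision
    # latent weights remain the parameters that the optimizer updates.
    def forward(self, x):
        w_bin = SoftSignBinarize.apply(self.weight)
        return nn.functional.conv2d(x, w_bin, self.bias, self.stride,
                                    self.padding, self.dilation, self.groups)


if __name__ == "__main__":
    layer = BinaryConv2d(3, 16, kernel_size=3, padding=1)
    out = layer(torch.randn(1, 3, 32, 32))
    out.sum().backward()  # gradients reach the latent full-precision weights
    print(out.shape, layer.weight.grad is not None)

Filter pruning and knowledge distillation would then be applied on top of such binarized layers; they are omitted here to keep the sketch short.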
