Structured Binary Neural Networks for Image Recognition

We propose methods to train convolutional neural networks (CNNs) with both binarized weights and activations, leading to quantized models that are specifically friendly to mobile devices with limited power capacity and computation resources. Previous works on quantizing CNNs often seek to approximate the floating-point information using a set of discrete values, which we call value approximation, typically assuming the same architecture as the full-precision networks. Here we take a novel "structure approximation" view of quantization---it is very likely that different architectures designed for low-bit networks may be better for achieving good performance. In particular, we propose a "network decomposition" strategy, termed Group-Net, in which we divide the network into groups. Thus, each full-precision group can be effectively reconstructed by aggregating a set of homogeneous binary branches. In addition, we learn effective connections among groups to improve the representation capability. Moreover, the proposed Group-Net shows strong generalization to other tasks. For instance, we extend Group-Net for accurate semantic segmentation by embedding rich context into the binary structure. Furthermore, for the first time, we apply binary neural networks to object detection. Experiments on both classification, semantic segmentation and object detection tasks demonstrate the superior performance of the proposed methods over various quantized networks in the literature. Our methods outperform the previous best binary neural networks in terms of accuracy and computation efficiency.

[1]  Philip Heng Wai Leong,et al.  FINN: A Framework for Fast, Scalable Binarized Neural Network Inference , 2016, FPGA.

[2]  Gang Yu,et al.  BiSeNet: Bilateral Segmentation Network for Real-time Semantic Segmentation , 2018, ECCV.

[3]  Swagath Venkataramani,et al.  PACT: Parameterized Clipping Activation for Quantized Neural Networks , 2018, ArXiv.

[4]  Ali Farhadi,et al.  You Only Look Once: Unified, Real-Time Object Detection , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[5]  Kaiming He,et al.  Feature Pyramid Networks for Object Detection , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[6]  Shuchang Zhou,et al.  DoReFa-Net: Training Low Bitwidth Convolutional Neural Networks with Low Bitwidth Gradients , 2016, ArXiv.

[7]  Li Fei-Fei,et al.  Progressive Neural Architecture Search , 2017, ECCV.

[8]  Iasonas Kokkinos,et al.  DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[9]  Zhijian Liu,et al.  HAQ: Hardware-Aware Automated Quantization With Mixed Precision , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[10]  Xiangyu Zhang,et al.  ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[11]  Tao Mei,et al.  daBNN: A Super Fast Inference Framework for Binary Neural Networks on ARM devices , 2019, ACM Multimedia.

[12]  Jian Sun,et al.  ExFuse: Enhancing Feature Fusion for Semantic Segmentation , 2018, ECCV.

[13]  Tony X. Han,et al.  Learning Efficient Object Detection Models with Knowledge Distillation , 2017, NIPS.

[14]  Quoc V. Le,et al.  Neural Architecture Search with Reinforcement Learning , 2016, ICLR.

[15]  George Papandreou,et al.  Rethinking Atrous Convolution for Semantic Image Segmentation , 2017, ArXiv.

[16]  François Chollet,et al.  Xception: Deep Learning with Depthwise Separable Convolutions , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[17]  Shwetak N. Patel,et al.  Heterogeneous Bitwidth Binarization in Convolutional Neural Networks , 2018, NeurIPS.

[18]  Pietro Perona,et al.  Microsoft COCO: Common Objects in Context , 2014, ECCV.

[19]  Avi Mendelson,et al.  UNIQ , 2018, ACM Trans. Comput. Syst..

[20]  Christoph Meinel,et al.  BMXNet: An Open-Source Binary Neural Network Implementation Based on MXNet , 2017, ACM Multimedia.

[21]  Ian D. Reid,et al.  RefineNet: Multi-path Refinement Networks for High-Resolution Semantic Segmentation , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[22]  Linda G. Shapiro,et al.  ESPNet: Efficient Spatial Pyramid of Dilated Convolutions for Semantic Segmentation , 2018, ECCV.

[23]  Forrest N. Iandola,et al.  SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <1MB model size , 2016, ArXiv.

[24]  Ali Farhadi,et al.  YOLO9000: Better, Faster, Stronger , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[25]  Zhuowen Tu,et al.  Aggregated Residual Transformations for Deep Neural Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[26]  Ian D. Reid,et al.  Structured Binary Neural Networks for Accurate Image Classification and Semantic Segmentation , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[27]  Mark Sandler,et al.  MobileNetV2: Inverted Residuals and Linear Bottlenecks , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[28]  Viktor K. Prasanna,et al.  Analysis of high-performance floating-point arithmetic on FPGAs , 2004, 18th International Parallel and Distributed Processing Symposium, 2004. Proceedings..

[29]  Ke Wang,et al.  AI Benchmark: Running Deep Neural Networks on Android Smartphones , 2018, ECCV Workshops.

[30]  Wei Liu,et al.  Bi-Real Net: Enhancing the Performance of 1-bit CNNs With Improved Representational Capability and Advanced Training Algorithm , 2018, ECCV.

[31]  Subhransu Maji,et al.  Semantic contours from inverse detectors , 2011, 2011 International Conference on Computer Vision.

[32]  Andreas Ehliar Area efficient floating-point adder and multiplier with IEEE-754 compatible semantics , 2014, 2014 International Conference on Field-Programmable Technology (FPT).

[33]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[34]  Georgios Tzimiropoulos,et al.  Training Binary Neural Networks with Real-to-Binary Convolutions , 2020, ICLR.

[35]  Wei Pan,et al.  Towards Accurate Binary Convolutional Neural Network , 2017, NIPS.

[36]  Yoshua Bengio,et al.  BinaryConnect: Training Deep Neural Networks with binary weights during propagations , 2015, NIPS.

[37]  Jae-Joon Han,et al.  Learning to Quantize Deep Networks by Optimizing Quantization Intervals With Task Loss , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[38]  Luc Van Gool,et al.  The Pascal Visual Object Classes (VOC) Challenge , 2010, International Journal of Computer Vision.

[39]  Sergey Ioffe,et al.  Rethinking the Inception Architecture for Computer Vision , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[40]  Junjie Yan,et al.  Quantization Mimic: Towards Very Tiny CNN for Object Detection , 2018, ECCV.

[41]  Yiming Yang,et al.  DARTS: Differentiable Architecture Search , 2018, ICLR.

[42]  Guosheng Lin,et al.  Efficient Piecewise Training of Deep Structured Models for Semantic Segmentation , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[43]  George Papandreou,et al.  Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation , 2018, ECCV.

[44]  Quoc V. Le,et al.  Efficient Neural Architecture Search via Parameter Sharing , 2018, ICML.

[45]  Michael S. Bernstein,et al.  ImageNet Large Scale Visual Recognition Challenge , 2014, International Journal of Computer Vision.

[46]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[47]  Trevor Darrell,et al.  Fully Convolutional Networks for Semantic Segmentation , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[48]  Lin Xu,et al.  Incremental Network Quantization: Towards Lossless CNNs with Low-Precision Weights , 2017, ICLR.

[49]  Ali Farhadi,et al.  YOLOv3: An Incremental Improvement , 2018, ArXiv.

[50]  Hao Chen,et al.  Fast Neural Architecture Search of Compact Semantic Segmentation Models via Auxiliary Cells , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[51]  Gang Hua,et al.  How to Train a Compact Binary Neural Network with High Accuracy? , 2017, AAAI.

[52]  Yan Wang,et al.  Fully Quantized Network for Object Detection , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[53]  Xiangyu Zhang,et al.  Channel Pruning for Accelerating Very Deep Neural Networks , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[54]  Yu Bai,et al.  ProxQuant: Quantized Neural Networks via Proximal Operators , 2018, ICLR.

[55]  Zhenzhi Wu,et al.  GXNOR-Net: Training deep neural networks with ternary weights and activations without full-precision memory under a unified discretization framework , 2017, Neural Networks.

[56]  Li Fei-Fei,et al.  Auto-DeepLab: Hierarchical Neural Architecture Search for Semantic Image Segmentation , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[57]  Christoph Meinel,et al.  Learning to Train a Binary Neural Network , 2018, ArXiv.

[58]  Dan Alistarh,et al.  Model compression via distillation and quantization , 2018, ICLR.

[59]  Bingbing Ni,et al.  Performance Guaranteed Network Acceleration via High-Order Residual Quantization , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[60]  James T. Kwok,et al.  Loss-aware Binarization of Deep Networks , 2016, ICLR.

[61]  Iasonas Kokkinos,et al.  Fast, Exact and Multi-scale Inference for Semantic Image Segmentation with Deep Gaussian CRFs , 2016, ECCV.

[62]  James T. Kwok,et al.  Loss-aware Weight Quantization of Deep Networks , 2018, ICLR.

[63]  Oriol Vinyals,et al.  Hierarchical Representations for Efficient Architecture Search , 2017, ICLR.

[64]  Linda G. Shapiro,et al.  ESPNetv2: A Light-Weight, Power Efficient, and General Purpose Convolutional Neural Network , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[65]  Ran El-Yaniv,et al.  Binarized Neural Networks , 2016, ArXiv.

[66]  Bo Chen,et al.  MnasNet: Platform-Aware Neural Architecture Search for Mobile , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[67]  Luca Antiga,et al.  Automatic differentiation in PyTorch , 2017 .

[68]  Christoph Meinel,et al.  Training Competitive Binary Neural Networks from Scratch , 2018, ArXiv.

[69]  Wei Liu,et al.  SSD: Single Shot MultiBox Detector , 2015, ECCV.

[70]  Haichen Shen,et al.  TVM: An Automated End-to-End Optimizing Compiler for Deep Learning , 2018, OSDI.

[71]  Liujuan Cao,et al.  Towards Optimal Structured CNN Pruning via Generative Adversarial Learning , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[72]  Ross B. Girshick,et al.  Focal Loss for Dense Object Detection , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[73]  Dumitru Erhan,et al.  Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[74]  Ian D. Reid,et al.  Towards Effective Low-Bitwidth Convolutional Neural Networks , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[75]  Hao Chen,et al.  FCOS: Fully Convolutional One-Stage Object Detection , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[76]  Quoc V. Le,et al.  EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks , 2019, ICML.

[77]  Ross B. Girshick,et al.  Fast R-CNN , 2015, 1504.08083.

[78]  Bo Chen,et al.  MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications , 2017, ArXiv.

[79]  Xin Dong,et al.  Binary Ensemble Neural Network: More Bits per Network or More Networks per Bit? , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[80]  Vijay Vasudevan,et al.  Learning Transferable Architectures for Scalable Image Recognition , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[81]  Trevor Darrell,et al.  Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation , 2013, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[82]  Asit K. Mishra,et al.  Apprentice: Using Knowledge Distillation Techniques To Improve Low-Precision Network Accuracy , 2017, ICLR.

[83]  Yurong Chen,et al.  Network Sketching: Exploiting Binary Structure in Deep CNNs , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[84]  Jing Liu,et al.  Discrimination-aware Channel Pruning for Deep Neural Networks , 2018, NeurIPS.

[85]  Kaiming He,et al.  Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[86]  Bo Chen,et al.  Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[87]  Wei Liu,et al.  Bi-Real Net: Binarizing Deep Network Towards Real-Network Performance , 2018, International Journal of Computer Vision.

[88]  Philip Heng Wai Leong,et al.  SYQ: Learning Symmetric Quantization for Efficient Deep Neural Networks , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[89]  Bin Liu,et al.  Ternary Weight Networks , 2016, ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[90]  Sergey Ioffe,et al.  Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning , 2016, AAAI.

[91]  Wei Liu,et al.  High-Level Semantic Feature Detection: A New Perspective for Pedestrian Detection , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[92]  Yoshua Bengio,et al.  Estimating or Propagating Gradients Through Stochastic Neurons for Conditional Computation , 2013, ArXiv.

[93]  Eunhyeok Park,et al.  Weighted-Entropy-Based Quantization for Deep Neural Networks , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[94]  Diana Marculescu,et al.  Regularizing Activation Distribution for Training Binarized Deep Networks , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[95]  G. Hua,et al.  LQ-Nets: Learned Quantization for Highly Accurate and Compact Deep Neural Networks , 2018, ECCV.

[96]  Song Han,et al.  Trained Ternary Quantization , 2016, ICLR.

[97]  Yuandong Tian,et al.  Mixed Precision Quantization of ConvNets via Differentiable Neural Architecture Search , 2018, ArXiv.

[98]  Jian Sun,et al.  Deep Learning with Low Precision by Half-Wave Gaussian Quantization , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[99]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[100]  Ling Shao,et al.  TBN: Convolutional Neural Network with Ternary Inputs and Binary Weights , 2018, ECCV.