Searching for Low-Bit Weights in Quantized Neural Networks

Quantized neural networks with low-bit weights and activations are attractive for developing AI accelerators. However, the quantization functions used in most conventional quantization methods are non-differentiable, which increases the difficulty of optimizing quantized networks. Compared with full-precision parameters (i.e., 32-bit floating-point numbers), low-bit values are drawn from a much smaller set; for example, there are only 16 possibilities in the 4-bit space. We therefore propose to treat the discrete weights of an arbitrary quantized neural network as searchable variables and to search for them accurately with a differentiable method. In particular, each weight is represented as a probability distribution over the discrete value set. The probabilities are optimized during training, and the values with the highest probability are selected to establish the desired quantized network. Experimental results on benchmarks demonstrate that the proposed method produces quantized neural networks with higher performance than state-of-the-art methods on both image classification and super-resolution tasks.
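Since the abstract only sketches the approach, the following is a minimal PyTorch-style sketch of the idea under stated assumptions: each low-bit weight keeps a learnable logit vector over an assumed uniform candidate grid, the forward pass uses the softmax-expected weight during training so gradients reach the logits, and inference selects the highest-probability value. The module name `SearchableQuantLinear`, the symmetric grid, and the expectation-based relaxation are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SearchableQuantLinear(nn.Module):
    """Hypothetical module: each weight is a probability distribution
    over a discrete value set, searched with ordinary gradient descent."""

    def __init__(self, in_features, out_features, bits=4):
        super().__init__()
        levels = 2 ** bits  # e.g. 16 candidate values in 4-bit space
        # Assumed uniform symmetric grid of candidate low-bit values.
        self.register_buffer("values", torch.linspace(-1.0, 1.0, levels))
        # One learnable logit vector per weight entry; zeros give a
        # uniform initial distribution over the candidates.
        self.logits = nn.Parameter(torch.zeros(out_features, in_features, levels))

    def forward(self, x):
        if self.training:
            # Differentiable surrogate: the softmax-expected weight,
            # so gradients flow to the logits through the forward pass.
            probs = F.softmax(self.logits, dim=-1)  # (out, in, levels)
            weight = probs @ self.values            # (out, in)
        else:
            # Final quantized network: keep the most probable candidate.
            weight = self.values[self.logits.argmax(dim=-1)]
        return F.linear(x, weight)

# Usage: train as usual; switching to .eval() yields discrete weights.
layer = SearchableQuantLinear(8, 4, bits=4)
y = layer(torch.randn(2, 8))
```

The expectation over the softmax is only one common differentiable surrogate; a sampling-based relaxation such as Gumbel-softmax would be an equally plausible reading of the abstract.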
