Diversifying Sample Generation for Accurate Data-Free Quantization

Quantization has emerged as one of the most prevalent approaches to compressing and accelerating neural networks. Recently, data-free quantization has been widely studied as a practical and promising solution: it synthesizes calibration data for the quantized model from the batch normalization (BN) statistics of the full-precision (FP32) model, greatly relieving the heavy dependence on real training data in traditional quantization methods. Unfortunately, we find in practice that synthetic data identically constrained by BN statistics suffers from severe homogenization at both the distribution level and the sample level, which in turn causes a significant performance drop in the quantized model. We propose the Diverse Sample Generation (DSG) scheme to mitigate these adverse effects. Specifically, we slack the alignment of feature statistics in the BN layers to relax the constraint at the distribution level, and we design a layerwise enhancement that reinforces specific layers for different data samples. Our DSG scheme is versatile and can even be applied to state-of-the-art post-training quantization methods such as AdaRound. We evaluate DSG on the large-scale image classification task and consistently obtain significant improvements across various network architectures and quantization methods, especially at lower bit-widths (e.g., up to 22% improvement on W4A4). Moreover, benefiting from the enhanced diversity, models calibrated with our synthetic data perform close to those calibrated with real data, and even outperform them on W4A4.
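To make the slack relaxation concrete, below is a minimal PyTorch sketch (not the authors' released code) of BN-statistics-matched sample synthesis with a slack margin `delta` that loosens the per-layer alignment. The hook placement, the hinge-style margin, and the hyperparameters (step count, learning rate, batch size) are illustrative assumptions, and the layerwise enhancement is omitted.

```python
import torch
import torch.nn as nn

def synthesize_calibration_batch(model, batch_size=32, steps=500, delta=0.1):
    """Optimize Gaussian noise so its per-layer statistics approach the
    stored BN statistics, up to a slack margin `delta` (an assumption
    standing in for the paper's relaxed distribution-level constraint)."""
    model.eval()                     # BN layers use running statistics
    for p in model.parameters():
        p.requires_grad_(False)      # only the input images are optimized

    stats = []                       # (running_mean, running_var) per BN layer
    feats = []                       # batch statistics captured by hooks
    hooks = []

    def make_hook(idx):
        def hook(module, inputs, output):
            x = inputs[0]
            # per-channel mean/variance over (N, H, W) of the BN input
            mu = x.mean(dim=(0, 2, 3))
            var = x.var(dim=(0, 2, 3), unbiased=False)
            feats[idx] = (mu, var)
        return hook

    for m in model.modules():
        if isinstance(m, nn.BatchNorm2d):
            stats.append((m.running_mean.detach(), m.running_var.detach()))
            feats.append(None)
            hooks.append(m.register_forward_hook(make_hook(len(feats) - 1)))

    x = torch.randn(batch_size, 3, 224, 224, requires_grad=True)
    opt = torch.optim.Adam([x], lr=0.1)
    for _ in range(steps):
        opt.zero_grad()
        model(x)
        loss = x.new_zeros(())
        for (mu_bn, var_bn), (mu_b, var_b) in zip(stats, feats):
            # slack: penalize only deviations beyond the margin, so the
            # batch is not forced to match the BN statistics exactly
            loss = loss + torch.clamp((mu_b - mu_bn).abs() - delta, min=0).sum()
            loss = loss + torch.clamp((var_b - var_bn).abs() - delta, min=0).sum()
        loss.backward()
        opt.step()

    for h in hooks:
        h.remove()
    return x.detach()

# usage (hypothetical): synthesize a batch from a pretrained ResNet-18
# model = torchvision.models.resnet18(weights="IMAGENET1K_V1")
# data = synthesize_calibration_batch(model)
```

Setting `delta = 0` recovers exact BN-statistics matching; a positive margin admits a family of valid solutions rather than a single one, which is the source of the distribution-level diversity described above.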
