DFQF: Data Free Quantization-aware Fine-tuning

Data-free quantization of deep neural networks is a practical challenge, since the original training data is often unavailable due to privacy, proprietary, or transmission concerns. Existing methods implicitly equate data-free with training-free and quantize the model by hand through analysis of the weight distributions, which leads to a significant accuracy drop at bit-widths below 6. In this work, we propose data-free quantization-aware fine-tuning (DFQF), in which no real training data is required and the quantized network is fine-tuned on generated images. Specifically, we first train a generator against the pre-trained full-precision network with an inception-score loss, a batch-normalization statistics loss, and an adversarial loss to synthesize a fake image set. We then fine-tune the quantized student network against the full-precision teacher network on the generated images using knowledge distillation (KD). The proposed DFQF outperforms state-of-the-art post-training quantization methods and achieves W4A4 quantization of ResNet-20 on the CIFAR-10 dataset within a 1% accuracy drop.
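As a concrete illustration of the two-stage pipeline above, the following PyTorch-style sketch shows (i) a generator objective combining an inception-score term and a batch-normalization statistics term (the adversarial term, supplied by a separate discriminator, is omitted here), and (ii) a standard KD loss for fine-tuning the quantized student against the full-precision teacher. Function names, the `bn_stats` structure, and the temperature `T` are illustrative assumptions, not the authors' released implementation.

```python
# Hedged sketch of the DFQF objectives, assuming PyTorch; names and
# hyper-parameters are illustrative, not the paper's code.
import torch
import torch.nn.functional as F


def generator_loss(fake_images, teacher, bn_stats):
    """Inception-score + BN-statistics terms for the image generator."""
    probs = F.softmax(teacher(fake_images), dim=1)

    # Inception-score-style term: encourage confident per-image predictions
    # (low conditional entropy) and diverse predictions across the batch
    # (high marginal entropy).
    p_mean = probs.mean(dim=0)
    is_loss = -(probs * torch.log(probs + 1e-8)).sum(dim=1).mean() \
              + (p_mean * torch.log(p_mean + 1e-8)).sum()

    # BN-statistics term: match the batch statistics of the synthetic images
    # (collected with forward hooks, assumed precomputed here) to the running
    # mean/variance stored in the teacher's BatchNorm layers.
    bn_loss = sum(F.mse_loss(cur_mu, run_mu) + F.mse_loss(cur_var, run_var)
                  for (cur_mu, cur_var), (run_mu, run_var) in bn_stats)

    return is_loss + bn_loss


def kd_loss(student_logits, teacher_logits, T=4.0):
    """Hinton-style distillation loss for fine-tuning the quantized student."""
    return F.kl_div(F.log_softmax(student_logits / T, dim=1),
                    F.softmax(teacher_logits / T, dim=1),
                    reduction="batchmean") * (T * T)
```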
