Generative Zero-shot Network Quantization

Convolutional neural networks can learn realistic image priors from numerous training samples in low-level image generation and restoration [66]. We show that, for high-level image recognition tasks, we can further reconstruct "realistic" images of each category by leveraging intrinsic Batch Normalization (BN) statistics without any training data. Inspired by popular VAE/GAN methods, we regard the zero-shot optimization of synthetic images as generative modeling that matches the distribution of the stored BN statistics. The generated images then serve as a calibration set for subsequent zero-shot network quantization. Our method suits settings where a model trained on sensitive data must be quantized but no data is available, e.g., due to privacy concerns. Extensive experiments on benchmark datasets show that, with the help of the generated data, our approach consistently outperforms existing data-free quantization methods.
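Because no training data is available, the calibration images are themselves optimized so that their per-layer feature statistics match the running mean and variance stored in each BN layer. Below is a minimal PyTorch sketch of this BN-statistics-matching objective, in the spirit of the approach described above (and of ZeroQ [43] and DeepInversion [21]); the names (BNStatsHook, synthesize_calibration_set) and hyperparameters are illustrative assumptions, not the paper's exact recipe.

import torch
import torch.nn as nn
import torch.nn.functional as F

class BNStatsHook:
    """Measures how far a batch's per-channel feature statistics are from the
    running mean/variance stored in one BatchNorm2d layer."""
    def __init__(self, bn):
        self.loss = None
        self.handle = bn.register_forward_hook(self._hook)

    def _hook(self, module, inputs, output):
        x = inputs[0]
        mean = x.mean(dim=(0, 2, 3))                # batch mean per channel
        var = x.var(dim=(0, 2, 3), unbiased=False)  # batch variance per channel
        self.loss = (F.mse_loss(mean, module.running_mean)
                     + F.mse_loss(var, module.running_var))

def synthesize_calibration_set(model, labels, num_steps=500, lr=0.1):
    """Optimize random noise into synthetic images whose BN statistics match
    those stored in the pretrained `model`. `labels` is a LongTensor giving
    the target class of each image (assumes ImageNet-style 224x224 inputs)."""
    model.eval()
    hooks = [BNStatsHook(m) for m in model.modules()
             if isinstance(m, nn.BatchNorm2d)]
    images = torch.randn(len(labels), 3, 224, 224, requires_grad=True)
    optimizer = torch.optim.Adam([images], lr=lr)
    for _ in range(num_steps):
        optimizer.zero_grad()
        logits = model(images)  # one forward pass fills every hook's loss
        bn_loss = sum(h.loss for h in hooks)
        # Cross-entropy term keeps each synthetic image class-conditional.
        loss = bn_loss + F.cross_entropy(logits, labels)
        loss.backward()
        optimizer.step()
    for h in hooks:
        h.handle.remove()
    return images.detach()

The detached images can then stand in for real data as the calibration set in a standard post-training quantization pipeline.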

[1] Tal Grossman, et al. The CHIR Algorithm for Feed Forward Networks with Binary Weights, 1989, NIPS.

[2] Han Zhang, et al. Self-Attention Generative Adversarial Networks, 2018, ICML.

[3] Paul Smolensky, et al. Grammar-based connectionist approaches to language, 1999, Cogn. Sci.

[4] Sergey Ioffe, et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, 2015, ICML.

[5] Aaron C. Courville, et al. What Do Compressed Deep Neural Networks Forget?, 2019, ArXiv:1911.05248.

[6] Andrea Vedaldi, et al. Understanding deep image representations by inverting them, 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[7] Ran El-Yaniv, et al. Quantized Neural Networks: Training Neural Networks with Low Precision Weights and Activations, 2016, J. Mach. Learn. Res.

[8] Geoffrey E. Hinton, et al. Distilling the Knowledge in a Neural Network, 2015, ArXiv.

[9] Andrea Vedaldi, et al. Deep Image Prior, 2017, International Journal of Computer Vision.

[10] Atul Prakash, et al. MAZE: Data-Free Model Stealing Attack Using Zeroth-Order Gradient Estimation, 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[11] Shuchang Zhou, et al. DoReFa-Net: Training Low Bitwidth Convolutional Neural Networks with Low Bitwidth Gradients, 2016, ArXiv.

[12] Thad Starner, et al. Data-Free Knowledge Distillation for Deep Neural Networks, 2017, ArXiv.

[13] Song Han, et al. Trained Ternary Quantization, 2016, ICLR.

[14] Zhiru Zhang, et al. Improving Neural Network Quantization without Retraining using Outlier Channel Splitting, 2019, ICML.

[15] Carl Doersch, et al. Tutorial on Variational Autoencoders, 2016, ArXiv.

[16] Kartikeya Bhardwaj, et al. Dream Distillation: A Data-Independent Model Compression Framework, 2019, ArXiv.

[17] Swagath Venkataramani, et al. PACT: Parameterized Clipping Activation for Quantized Neural Networks, 2018, ArXiv.

[18] Qiang Chen, et al. Unsupervised Network Quantization via Fixed-Point Factorization, 2020, IEEE Transactions on Neural Networks and Learning Systems.

[19] Alexander Finkelstein, et al. Fighting Quantization Bias With Bias, 2019, ArXiv.

[20] Amos Storkey, et al. Zero-shot Knowledge Transfer via Adversarial Belief Matching, 2019, NeurIPS.

[21] Derek Hoiem, et al. Dreaming to Distill: Data-Free Knowledge Transfer via DeepInversion, 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[22] Markus Nagel, et al. Data-Free Quantization Through Weight Equalization and Bias Correction, 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[23] Natalia Gimelshein, et al. PyTorch: An Imperative Style, High-Performance Deep Learning Library, 2019, NeurIPS.

[24] Yann LeCun, et al. Optimal Brain Damage, 1989, NIPS.

[25] K. Aldape, et al. Precision histology: how deep learning is poised to revitalize histomorphology for personalized cancer care, 2017, npj Precision Oncology.

[26] Ian Goodfellow, et al. Deep Learning with Differential Privacy, 2016, CCS.

[27] Deborah Silver, et al. Feature Visualization, 1994, Scientific Visualization.

[28] Rishi Sharma, et al. A Note on the Inception Score, 2018, ArXiv.

[29] Pritish Narayanan, et al. Deep Learning with Limited Numerical Precision, 2015, ICML.

[30] Yee Whye Teh, et al. The Concrete Distribution: A Continuous Relaxation of Discrete Random Variables, 2016, ICLR.

[31] Yuichi Yoshida, et al. Spectral Normalization for Generative Adversarial Networks, 2018, ICLR.

[32] Max Welling, et al. Batch-shaping for learning conditional channel gated networks, 2019, ICLR.

[33] Bin Liu, et al. Ternary Weight Networks, 2016, ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[34] Yoshua Bengio, et al. BinaryConnect: Training Deep Neural Networks with binary weights during propagations, 2015, NIPS.

[35] K. Asanović. Experimental Determination of Precision Requirements for Back-propagation Training of Artificial Neural Networks, 1991.

[36] Tim Dettmers, et al. 8-Bit Approximations for Parallelism in Deep Learning, 2015, ICLR.

[37] Emily Denton, et al. Characterising Bias in Compressed Models, 2020, ArXiv.

[38] Raghuraman Krishnamoorthi, et al. Quantizing deep convolutional networks for efficient inference: A whitepaper, 2018, ArXiv.

[39] Elad Hoffer, et al. The Knowledge Within: Methods for Data-Free Model Compression, 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[40] Mingkui Tan, et al. Generative Low-bitwidth Data Free Quantization, 2020, ECCV.

[41] Xiaolin Hu, et al. Interpret Neural Networks by Identifying Critical Data Routing Paths, 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[42] Tzi-Dar Chiueh, et al. Learning algorithms for neural networks with ternary weights, 1988, Neural Networks.

[43] Kurt Keutzer, et al. ZeroQ: A Novel Zero Shot Quantization Framework, 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[44] Georgios Tzimiropoulos, et al. Training Binary Neural Networks with Real-to-Binary Convolutions, 2020, ICLR.

[45] Wei Pan, et al. Towards Accurate Binary Convolutional Neural Network, 2017, NIPS.

[46] Elad Hoffer, et al. ACIQ: Analytical Clipping for Integer Quantization of neural networks, 2018, ArXiv.

[47] Wonyong Sung, et al. Resiliency of Deep Neural Networks under Quantization, 2015, ArXiv.

[48] Jian Cheng, et al. Learning Compression from Limited Unlabeled Data, 2018, ECCV.

[49] Jihwan P. Choi, et al. Data-Free Network Quantization With Adversarial Knowledge Distillation, 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[50] Wojciech Zaremba, et al. Improved Techniques for Training GANs, 2016, NIPS.

[51] Qingshan Liu, et al. ProxyBNN: Learning Binarized Neural Networks via Proxy Matrices, 2020, ECCV.

[52] Rana Ali Amjad, et al. Up or Down? Adaptive Rounding for Post-Training Quantization, 2020, ICML.

[53] David Saad, et al. Training a network with ternary weights using the CHIR algorithm, 1993, IEEE Trans. Neural Networks.

[54] Yoni Choukroun, et al. Low-bit Quantization of Neural Networks for Efficient Inference, 2019, 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW).

[55] Cordelia Schmid, et al. How good is my GAN?, 2018, ECCV.

[56] Max Welling, et al. Relaxed Quantization for Discretized Neural Networks, 2018, ICLR.

[57] Alexander Mordvintsev, et al. Inceptionism: Going Deeper into Neural Networks, 2015.

[58] Jeff Donahue, et al. Large Scale GAN Training for High Fidelity Natural Image Synthesis, 2018, ICLR.

[59] Hyungjun Kim, et al. BinaryDuo: Reducing Gradient Mismatch in Binary Activation Network by Coupling Binary Activations, 2020, ICLR.

[60] Daniel Soudry, et al. Post training 4-bit quantization of convolutional networks for rapid-deployment, 2018, NeurIPS.

[61] Kaiming He, et al. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks, 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[62] Bo Chen, et al. Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference, 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[63] Nick Cammarata, et al. Zoom In: An Introduction to Circuits, 2020.

[64] Trung Le, et al. MGAN: Training Generative Adversarial Nets with Multiple Generators, 2018, ICLR.

[65] Ran El-Yaniv, et al. Binarized Neural Networks, 2016, ArXiv.

[66] Yiming Yang, et al. DARTS: Differentiable Architecture Search, 2018, ICLR.

[67] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[68] Max Welling, et al. Probabilistic Binary Neural Networks, 2018, ArXiv.

[69] Lin Xu, et al. Incremental Network Quantization: Towards Lossless CNNs with Low-Precision Weights, 2017, ICLR.

[70] R. Venkatesh Babu, et al. Zero-Shot Knowledge Distillation in Deep Networks, 2019, ICML.

[71] Bernard F. Buxton, et al. Drug Design by Machine Learning: Support Vector Machines for Pharmaceutical Data Analysis, 2001, Comput. Chem.

[72] Qi Tian, et al. Data-Free Learning of Student Networks, 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[73] Ali Farhadi, et al. XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks, 2016, ECCV.

[74] U Kang, et al. Knowledge Extraction with No Observable Data, 2019, NeurIPS.

[75] Sridhar Mahadevan, et al. Generative Multi-Adversarial Networks, 2016, ICLR.