Microdosing: Knowledge Distillation for GAN-based Compression

Recently, significant progress has been made in learned image and video compression. In particular, the use of Generative Adversarial Networks (GANs) has led to impressive results in the low-bit-rate regime. However, model size remains an important issue in current state-of-the-art proposals, and existing solutions require significant computational effort on the decoding side. This limits their use in realistic scenarios and their extension to video compression. In this paper, we demonstrate how to leverage knowledge distillation to obtain equally capable image decoders at a fraction of the original number of parameters. We investigate several aspects of our solution, including sequence specialization with side information for image coding. Finally, we also show how to transfer the obtained benefits to the setting of video compression. Altogether, our proposal makes it possible to reduce the decoder model size by a factor of 20 and to achieve a 50% reduction in decoding time.
