Paper Information - Generative Adversarial Networks for Extreme Learned Image Compression

Generative Adversarial Networks for Extreme Learned Image Compression

We present a learned image compression system based on GANs, operating at extremely low bitrates. Our proposed framework combines an encoder, decoder/generator and a multi-scale discriminator, which we train jointly for a generative learned compression objective. The model synthesizes details it cannot afford to store, obtaining visually pleasing results at bitrates where previous methods fail and show strong artifacts. Furthermore, if a semantic label map of the original image is available, our method can fully synthesize unimportant regions in the decoded image such as streets and trees from the label map, proportionally reducing the storage cost. A user study confirms that for low bitrates, our approach is preferred to state-of-the-art methods, even when they use more than double the bits.
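The claim that synthesizing regions from a label map reduces storage cost proportionally can be illustrated with simple arithmetic. The sketch below is not the authors' implementation. The function names, the latent shape, the stride, and the number of quantization levels are hypothetical values chosen for illustration, and the cost of storing the label map itself is ignored.

```python
import math


def latent_bpp_upper_bound(height, width, stride, channels, levels):
    """Upper bound on bits per pixel for a quantized latent.

    Assumes (hypothetically) that the encoder downsamples the image by
    `stride` in each dimension, outputs `channels` feature maps, and
    quantizes every entry to one of `levels` values with no entropy coding.
    """
    latent_entries = (height // stride) * (width // stride) * channels
    total_bits = latent_entries * math.log2(levels)
    return total_bits / (height * width)


def selective_bpp(full_bpp, preserved_fraction):
    """Bits per pixel when only part of the latent is stored.

    `preserved_fraction` is the share of the image whose latent entries
    are kept; the rest is assumed to be synthesized from the label map.
    """
    if not 0.0 <= preserved_fraction <= 1.0:
        raise ValueError("preserved_fraction must lie in [0, 1]")
    return full_bpp * preserved_fraction


# Illustrative numbers only: a 512 x 1024 image, stride 16, 8 channels, 5 levels.
full = latent_bpp_upper_bound(512, 1024, stride=16, channels=8, levels=5)
# Keep the latent for a quarter of the image and synthesize the rest.
partial = selective_bpp(full, preserved_fraction=0.25)
print(f"full: {full:.4f} bpp, partial: {partial:.4f} bpp")
```

Under these assumptions, storing the latent for a quarter of the image costs a quarter of the bits, which is the proportional reduction the abstract describes.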

L. Van Gool | M. Tschannen | R. Timofte | E. Agustsson | Fabian Mentzer
