Paper information - Toward Variable-Rate Generative Compression by Reducing the Channel Redundancy

Compressing large images with a generative model can outperform conventional image coding standards at very low bitrates. In this paper, we take a step toward practical generative compression systems, building on recent advances. Specifically, we show that the channel redundancy in the latent representation produced by an autoencoder network can be effectively reduced via mask compression. Mask compression quantizes the channel variance of the latent representation rather than the original values. Changing the mask then yields a simple and efficient variable-rate compression scheme, with no need to train multiple models. We estimate the relative bitrate by measuring the L1 norm of the channel variance, and from this obtain a rate-distortion formulation. The L1 regularizer assumes a Laplacian prior on the channel variance; under this model we develop methods for producing approximate images at a target bitrate. This eliminates the need to manually search hyperparameters for our variable-rate compression. We conduct extensive experiments to demonstrate that the proposed method preserves image quality and semantics well.
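The abstract does not give implementation details, so the following is only a minimal NumPy sketch of the quantities it names: the per-channel variance of a latent tensor, its L1 norm as a relative-rate proxy, and a channel mask that trades rate for fidelity. The latent shape `(C, H, W)`, the function names, and the top-k masking rule are all assumptions made for illustration; they are not taken from the paper.

```python
import numpy as np

def channel_variance(latent):
    """Per-channel variance of a latent tensor shaped (C, H, W) (assumed layout)."""
    c = latent.shape[0]
    return latent.reshape(c, -1).var(axis=1)

def l1_rate_proxy(latent):
    """L1 norm of the channel variances, used here as a relative-rate proxy."""
    return float(np.abs(channel_variance(latent)).sum())

def top_k_channel_mask(latent, k):
    """Hypothetical mask rule: keep the k channels with the largest variance."""
    var = channel_variance(latent)
    keep = np.argsort(var)[::-1][:k]
    mask = np.zeros(latent.shape[0], dtype=bool)
    mask[keep] = True
    return mask

def apply_mask(latent, mask):
    """Zero out every channel that the mask drops."""
    return latent * mask[:, None, None]

# Toy latent: six channels sharing one spatial pattern at decreasing scales.
scales = np.array([4.0, 2.0, 1.0, 0.5, 0.1, 0.01])
pattern = np.linspace(-1.0, 1.0, 64).reshape(8, 8)
latent = scales[:, None, None] * pattern

# A smaller k keeps fewer channels and lowers the rate proxy.
for k in (6, 3, 1):
    masked = apply_mask(latent, top_k_channel_mask(latent, k))
    print(k, l1_rate_proxy(masked))
```

In this toy setup a single latent is reused and only the mask changes, which mirrors the abstract's point that variable rate can come from changing the mask rather than from training multiple models.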