Paper information - Scalar and Vector Quantization for Learned Image Compression: A Study on the Effects of MSE and GAN Loss in Various Spaces

Scalar and Vector Quantization for Learned Image Compression: A Study on the Effects of MSE and GAN Loss in Various Spaces

Recently, learned image compression by means of neural networks has experienced a performance boost through the use of adversarial loss functions. Typically, a generative adversarial network (GAN) is designed with the generator being an autoencoder that has a quantizer in the bottleneck for compression and reconstruction. It is well known from rate-distortion theory that vector quantizers provide lower quantization errors than scalar quantizers at the same bitrate. Still, learned image compression approaches often use scalar quantization instead. In this work we provide insights into the image reconstruction quality of the often-employed uniform scalar quantizers, of non-uniform scalar quantizers, and of the rarely employed but bitrate-efficient vector quantizers, all integrated into backpropagation and operating at exactly the same bitrate. Further insights are obtained by investigating an MSE loss and a GAN loss. We show that vector quantization is always beneficial for compression performance, both in the latent space and in the reconstructed image space. However, image samples demonstrate that the GAN loss produces the more visually pleasing reconstructed images, while the non-adversarial MSE loss yields better scores on various instrumental quality measures, both in the latent space and on the reconstructed images.
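The claim that vector quantizers give lower quantization error than scalar quantizers at the same bitrate can be illustrated with a small numerical sketch. It is not the paper's setup: the correlated 2-D Gaussian source, the clipping range of [-3, 3], the budget of 2 bits per dimension, and the helper names (`nearest_codeword`, `uniform_scalar_levels`, `lloyd`, `mse`) are all assumptions chosen for illustration. The vector quantizer is trained with plain generalized Lloyd iterations, started from the scalar quantizer's 4 x 4 product grid, so both quantizers use 16 codewords (4 bits per 2-D vector with fixed-length coding). Both errors are measured on the training samples.

```python
import numpy as np

def nearest_codeword(x, codebook):
    """Index of the nearest codeword (squared Euclidean distance) for each row of x."""
    d = ((x[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

def uniform_scalar_levels(num_levels, lo, hi):
    """Midpoints of num_levels equal-width cells covering [lo, hi]."""
    step = (hi - lo) / num_levels
    return lo + step * (np.arange(num_levels) + 0.5)

def lloyd(x, codebook, iters=20):
    """Generalized Lloyd iterations: nearest-neighbor assignment, then centroid update."""
    codebook = codebook.copy()
    for _ in range(iters):
        idx = nearest_codeword(x, codebook)
        for k in range(len(codebook)):
            members = x[idx == k]
            if len(members) > 0:  # leave codewords of empty cells unchanged
                codebook[k] = members.mean(axis=0)
    return codebook

def mse(x, codebook):
    """Mean squared quantization error per vector component."""
    idx = nearest_codeword(x, codebook)
    return float(((x - codebook[idx]) ** 2).mean())

# Assumed toy source: strongly correlated 2-D Gaussian samples.
rng = np.random.default_rng(0)
cov = np.array([[1.0, 0.9], [0.9, 1.0]])
x = rng.multivariate_normal(np.zeros(2), cov, size=5000)

# Uniform scalar quantizer: 4 levels per dimension, i.e. a 4 x 4 product grid.
levels = uniform_scalar_levels(4, -3.0, 3.0)
grid = np.array([[a, b] for a in levels for b in levels])
mse_sq = mse(x, grid)

# Vector quantizer with the same 16 codewords, refined jointly in 2-D.
vq_codebook = lloyd(x, grid)
mse_vq = mse(x, vq_codebook)

print(f"uniform scalar quantizer MSE: {mse_sq:.4f}")
print(f"vector quantizer MSE:         {mse_vq:.4f}")
```

Because each Lloyd step can only keep or reduce the training-set error, and the starting codebook is the scalar grid itself, the vector quantizer cannot end up worse than the scalar one on these samples. The sketch covers only the quantizers; making them trainable inside an autoencoder requires a gradient approximation such as a straight-through estimator, which is not shown here.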
