Paper Information - Efficient Variable Rate Image Compression With Multi-Scale Decomposition Network

While deep learning image compression methods have shown impressive coding performance, most of them produce a single compression rate, using a network trained and optimized specifically for that rate. In practice, however, it is essential to support variable rate compression or to meet a target rate while maintaining high coding performance. This paper proposes a novel image compression method that allows a single convolutional neural network (CNN) model to generate variable rates efficiently with optimized rate-distortion (RD) performance. The method consists of a CNN-based multi-scale decomposition transform and content-adaptive rate allocation. Specifically, the transform network learns to decompose the input image into several scales of representations while optimizing the RD performance for all scales. Rate allocation algorithms for two typical scenarios are provided to determine the optimal scale of each image block for a given target rate or quality factor. For a target rate, the allocation adapts to content complexity. For a target quality factor, which indicates a tradeoff between rate and quality, the optimal scale is determined by minimizing the RD cost. Experimental results show that our method outperforms the JPEG2000 and BPG standards, achieving high efficiency and state-of-the-art RD performance as measured by the multi-scale structural similarity (MS-SSIM) index. Moreover, our method can strictly control the rate to meet the target compression result.
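The two rate allocation scenarios in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the function names, the per-block rate and distortion tables, and the greedy upgrade rule are all assumptions made for illustration. The quality-factor case picks, for each block, the scale that minimizes the RD cost J = D + lambda * R. The target-rate case uses distortion reduction per extra bit as a stand-in for the content complexity measure, which the abstract does not specify.

```python
from typing import List, Sequence, Tuple


def select_scales_by_rd_cost(
    rates: Sequence[Sequence[float]],
    distortions: Sequence[Sequence[float]],
    lam: float,
) -> List[int]:
    """For each block, return the scale index minimizing J = D + lam * R.

    rates[b][s] and distortions[b][s] are the (hypothetical) measured rate
    and distortion of block b when coded at scale s.
    """
    choices = []
    for block_rates, block_dists in zip(rates, distortions):
        costs = [d + lam * r for r, d in zip(block_rates, block_dists)]
        choices.append(min(range(len(costs)), key=costs.__getitem__))
    return choices


def allocate_for_target_rate(
    rates: Sequence[Sequence[float]],
    distortions: Sequence[Sequence[float]],
    budget: float,
) -> Tuple[List[int], float]:
    """Greedy allocation under a total rate budget (illustrative assumption).

    Assumes scales are ordered by increasing rate. Every block starts at its
    cheapest scale; the block with the largest distortion reduction per extra
    bit is upgraded one scale at a time while the budget allows it.
    """
    choices = [0] * len(rates)
    spent = sum(block_rates[0] for block_rates in rates)
    while True:
        best_block, best_gain = None, 0.0
        for b, s in enumerate(choices):
            if s + 1 >= len(rates[b]):
                continue  # block is already at its finest scale
            extra = rates[b][s + 1] - rates[b][s]
            if extra <= 0 or spent + extra > budget:
                continue  # upgrade is invalid or would exceed the budget
            gain = (distortions[b][s] - distortions[b][s + 1]) / extra
            if gain > best_gain:
                best_block, best_gain = b, gain
        if best_block is None:
            break  # no affordable upgrade reduces distortion
        s = choices[best_block]
        spent += rates[best_block][s + 1] - rates[best_block][s]
        choices[best_block] += 1
    return choices, spent


# Made-up numbers: two blocks, three scales each (bits and distortion).
example_rates = [[100, 200, 400], [100, 300, 600]]
example_distortions = [[0.30, 0.10, 0.05], [0.50, 0.40, 0.10]]

by_quality = select_scales_by_rd_cost(example_rates, example_distortions, lam=0.001)
by_rate, bits_used = allocate_for_target_rate(example_rates, example_distortions, budget=600)
```

In this sketch, a larger `lam` penalizes rate more heavily and pushes blocks toward coarser scales, while the greedy allocator never exceeds `budget` as long as the cheapest configuration already fits within it.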