Quality Assessment of Deep-Learning-Based Image Compression

Image compression standards rely on predictive coding, transform coding, quantization, and entropy coding to achieve high compression performance. Very recently, deep generative models have been used to optimize or replace some of these operations, with very promising results. However, no systematic and independent study of the coding performance of these algorithms has been carried out so far. In this paper, we conduct for the first time a subjective evaluation of two recent deep-learning-based image compression algorithms, comparing them to JPEG 2000 and to the recent BPG image codec based on HEVC Intra. We find that compression approaches based on deep auto-encoders can achieve coding performance higher than JPEG 2000, and sometimes as good as BPG. We also show experimentally that PSNR should be avoided when evaluating the visual quality of deep-learning-based methods, as their artifacts have different characteristics from those of DCT- or wavelet-based codecs. In particular, images compressed at low bitrates with these methods appear more natural than JPEG 2000 coded pictures, according to a no-reference naturalness measure. Our study indicates that deep generative models are likely to bring substantial innovation to the video coding arena in the coming years.
