Learning Content-Weighted Deep Image Compression

Learning-based lossy image compression usually involves jointly optimizing rate and distortion, and must cope with both the spatial variation of image content and the contextual dependence among learned codes. Traditional entropy models can adapt the local bit rate to the image content, but are usually limited in how much context they exploit in code space. Most deep context models, on the other hand, are computationally expensive and cannot decode symbols efficiently in parallel. In this paper, we present a content-weighted encoder-decoder model in which channel-wise multi-valued quantization discretizes the encoder features, and an importance map subnet generates importance masks for spatially varying code pruning. Consequently, the sum of the importance masks can serve as an upper bound on the length of the bitstream. Furthermore, the quantized representations of the learned code and the importance map are still spatially dependent, so they can be losslessly compressed further using arithmetic coding. To compress the codes effectively and efficiently, we propose an upper-triangular masked convolutional network (triuMCN) for large context modeling. Experiments show that the proposed method produces visually much better results and performs favorably against deep and traditional lossy image compression approaches.
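The role of the importance masks can be illustrated with a minimal NumPy sketch. It is not the model described above: the function names `importance_mask` and `quantize_uniform`, the ceiling-based rule that maps an importance value to a number of kept channels, and the uniform quantizer standing in for the learned channel-wise multi-valued quantization are all assumptions made for illustration. The sketch only shows how a per-position mask prunes code channels and why the mask sum bounds the number of transmitted symbols.

```python
import numpy as np

def importance_mask(importance, n_channels, n_levels):
    """Binary mask of shape (n_channels, H, W) from an importance map in [0, 1].

    Hypothetical rule: each spatial position keeps a number of leading
    channels proportional to its quantized importance level.
    """
    if n_channels % n_levels != 0:
        raise ValueError("n_channels must be a multiple of n_levels")
    # Quantize importance to integer levels 0..n_levels.
    levels = np.clip(np.ceil(importance * n_levels), 0, n_levels).astype(int)
    kept = levels * (n_channels // n_levels)            # shape (H, W)
    channel_idx = np.arange(n_channels)[:, None, None]  # shape (C, 1, 1)
    return (channel_idx < kept[None, :, :]).astype(np.uint8)

def quantize_uniform(features, n_values):
    """Stand-in quantizer: map features in [0, 1] to symbols 0..n_values-1."""
    symbols = np.round(features * (n_values - 1))
    return np.clip(symbols, 0, n_values - 1).astype(int)

rng = np.random.default_rng(0)
features = rng.random((8, 4, 4))   # toy encoder output: 8 channels, 4x4 grid
importance = rng.random((4, 4))    # toy importance map

n_values = 4
mask = importance_mask(importance, n_channels=8, n_levels=4)
codes = quantize_uniform(features, n_values) * mask  # pruned positions become 0

# Each kept symbol costs at most log2(n_values) bits before entropy coding,
# so the mask sum bounds the code length (the cost of sending the
# importance map itself is not counted here).
upper_bound_bits = mask.sum() * np.log2(n_values)
print(int(mask.sum()), "symbols kept, at most", upper_bound_bits, "bits")
```

In this sketch, positions with low importance keep few channels and positions with high importance keep many, so the bit budget varies spatially with the content; a context model followed by arithmetic coding would then reduce the cost below this bound.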
