Paper information - Deep Image Compression in the Wavelet Transform Domain Based on High Frequency Sub-Band Prediction

In this paper, we propose to use deep neural networks for image compression in the wavelet transform domain. Transforming the input image from the spatial pixel domain to the wavelet transform domain yields one low-frequency sub-band (LF sub-band) and three high-frequency sub-bands (HF sub-bands). The LF sub-band is first used to predict each HF sub-band, which eliminates redundancy between the sub-bands; the sub-bands are then fed into separate auto-encoders for encoding. To further improve compression efficiency, we use a conditional probability model to estimate the context-dependent prior probability of the encoded codes, which can be used for entropy coding. The entire training process is unsupervised, and the auto-encoders and the conditional probability model are trained jointly. The experimental results show that the proposed approach outperforms JPEG, JPEG2000, BPG, and some mainstream neural network-based image compression methods. Furthermore, it produces better visual quality with clearer details and textures, because the high-frequency prediction allows more high-frequency coefficients to be retained.
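The sub-band decomposition described in the abstract can be illustrated with a minimal sketch. It assumes a single-level 2-D Haar wavelet, which the abstract does not specify, and the function names `haar_dwt2` and `haar_idwt2` are hypothetical helpers written for this illustration, not the authors' code. It shows only how one LF sub-band and three HF sub-bands arise and that the transform is invertible; the prediction networks, auto-encoders, and probability model are not modelled.

```python
def haar_dwt2(img):
    """Single-level 2-D Haar transform (assumed wavelet, for illustration only).

    img is a list of rows with even height and width. Returns four sub-bands,
    each half the input size: one low-frequency band and three high-frequency
    bands (column differences, row differences, diagonal detail).
    """
    h, w = len(img), len(img[0])
    if h % 2 or w % 2:
        raise ValueError("height and width must be even")
    lf, hf_col, hf_row, hf_diag = [], [], [], []
    for i in range(0, h, 2):
        r_lf, r_col, r_row, r_diag = [], [], [], []
        for j in range(0, w, 2):
            a, b = img[i][j], img[i][j + 1]          # top row of the 2x2 block
            c, d = img[i + 1][j], img[i + 1][j + 1]  # bottom row of the block
            r_lf.append((a + b + c + d) / 2)    # low-pass in both directions
            r_col.append((a - b + c - d) / 2)   # left column minus right column
            r_row.append((a + b - c - d) / 2)   # top row minus bottom row
            r_diag.append((a - b - c + d) / 2)  # diagonal detail
        lf.append(r_lf)
        hf_col.append(r_col)
        hf_row.append(r_row)
        hf_diag.append(r_diag)
    return lf, hf_col, hf_row, hf_diag


def haar_idwt2(lf, hf_col, hf_row, hf_diag):
    """Invert haar_dwt2 and rebuild the full-resolution image."""
    img = []
    for i in range(len(lf)):
        top, bottom = [], []
        for j in range(len(lf[0])):
            s, x, y, z = lf[i][j], hf_col[i][j], hf_row[i][j], hf_diag[i][j]
            top += [(s + x + y + z) / 2, (s - x + y - z) / 2]
            bottom += [(s + x - y - z) / 2, (s - x - y + z) / 2]
        img.append(top)
        img.append(bottom)
    return img


if __name__ == "__main__":
    image = [[1, 2, 3, 4],
             [5, 6, 7, 8],
             [9, 10, 11, 12],
             [13, 14, 15, 16]]
    bands = haar_dwt2(image)
    for name, band in zip(("LF", "HF-col", "HF-row", "HF-diag"), bands):
        print(name, band)
    print("reconstructed:", haar_idwt2(*bands))
```

In the scheme the abstract describes, the LF band would additionally serve as input to a predictor for each HF band before encoding; this sketch stops at the decomposition.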
