Paper Information - Image Coding With Data-Driven Transforms: Methodology, Performance and Potential

Image compression has been an important research topic over the last few decades because of the explosive growth in the number of images. Popular image compression formats are based on different transforms, which convert images from the spatial domain into a compact frequency domain to remove spatial correlation. In this paper, we explore a data-driven transform, the Karhunen-Loève transform (KLT), whose kernels are derived from specific images via Principal Component Analysis (PCA), and design a highly efficient KLT-based image compression algorithm with variable transform sizes. To approach the optimal compression performance, multiple transform sizes and categories are used and selected adaptively according to their rate-distortion (RD) costs. Moreover, comprehensive analyses of the transform coefficients are provided, and a band-adaptive quantization scheme is proposed based on the RD performance of the coefficients. Extensive experiments are performed on several sets of class-specific images as well as on general images, and the proposed method achieves significant coding gain over popular image compression standards, including JPEG and JPEG 2000, and over state-of-the-art dictionary-learning-based methods.
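As a rough illustration of how KLT kernels can be derived from image data via PCA, the following minimal sketch learns a block transform from a synthetic image and applies it forward and backward. It is not the paper's implementation: the function names (`extract_blocks`, `learn_klt`, `forward_klt`, `inverse_klt`), the fixed 4x4 block size, and the random-walk test image are all assumptions made for this example, and the RD-based transform selection and band-adaptive quantization described in the abstract are not modeled.

```python
import numpy as np


def extract_blocks(image, size):
    """Split a 2-D array into non-overlapping size x size blocks (one flattened block per row)."""
    h, w = image.shape
    h, w = h - h % size, w - w % size  # drop partial blocks at the borders
    blocks = image[:h, :w].reshape(h // size, size, w // size, size)
    return blocks.transpose(0, 2, 1, 3).reshape(-1, size * size).astype(np.float64)


def learn_klt(blocks):
    """Return (mean, kernels); kernel columns are covariance eigenvectors, largest eigenvalue first."""
    mean = blocks.mean(axis=0)
    cov = np.cov(blocks - mean, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigh returns eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]       # reorder to descending
    return mean, eigvecs[:, order]


def forward_klt(blocks, mean, kernels):
    """Project mean-removed blocks onto the KLT kernels."""
    return (blocks - mean) @ kernels


def inverse_klt(coeffs, mean, kernels):
    """Invert the transform; the kernels are orthonormal, so the inverse is the transpose."""
    return coeffs @ kernels.T + mean


# Hypothetical test data: a horizontally correlated random-walk "image".
rng = np.random.default_rng(0)
image = np.cumsum(rng.normal(size=(64, 64)), axis=1)

blocks = extract_blocks(image, 4)
mean, kernels = learn_klt(blocks)
coeffs = forward_klt(blocks, mean, kernels)
reconstructed = inverse_klt(coeffs, mean, kernels)
```

Because the kernels are ordered by decreasing eigenvalue, the variance of the coefficients is concentrated in the leading bands. In a codec of the kind the abstract describes, that ordering is presumably what a band-dependent quantizer would exploit, with the mean and kernels needing to be available to the decoder as well.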
