Enhanced Multiple Transform for Video Coding

The Discrete Cosine Transform (DCT), and in particular the type-II DCT (DCT-II), has been widely used for image and video compression. Although the DCT efficiently approximates the optimal Karhunen-Loève transform under first-order Markov conditions at low complexity, its energy-packing efficiency is still limited, because a fixed transform cannot always capture the highly dynamic statistics of natural video content. To further improve transform efficiency, this paper proposes an Enhanced Multiple Transform (EMT) scheme. In the proposed EMT, several sinusoidal transforms other than the DCT are also used for coding both intra and inter prediction residuals. The best transform is selected from a pre-defined transform subset determined by the prediction mode, and is explicitly signaled in a joint manner at the coding block level. Moreover, to accelerate the encoding process, fast methods are proposed that skip unnecessary transform rate-distortion evaluations by using previously collected encoding statistics. The proposed method has been implemented on top of the High Efficiency Video Coding (HEVC) reference software, and significant coding gain has been verified.
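The idea of picking the best transform from a small candidate subset by rate-distortion cost can be illustrated with a minimal sketch. It is not the paper's implementation. It makes several assumptions: the residual is one-dimensional (a real codec applies separable 2-D transforms), the candidate set contains only DCT-II and DST-VII (the abstract does not name the sinusoidal transforms used, so DST-VII is just an illustrative choice), and the cost is a crude proxy (squared quantization error plus a Lagrangian weight times the number of nonzero levels). The names `select_transform`, `rd_cost`, `step`, and `lam` are hypothetical.

```python
import math


def dct2_matrix(n):
    """Orthonormal DCT-II basis; row k holds the k-th basis function."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos(math.pi * k * (2 * j + 1) / (2 * n))
                     for j in range(n)])
    return rows


def dst7_matrix(n):
    """Orthonormal DST-VII basis; row k holds the k-th basis function."""
    scale = math.sqrt(4.0 / (2 * n + 1))
    return [[scale * math.sin(math.pi * (2 * k + 1) * (j + 1) / (2 * n + 1))
             for j in range(n)]
            for k in range(n)]


def forward(matrix, residual):
    """Apply a 1-D transform: one inner product per basis row."""
    return [sum(b * x for b, x in zip(row, residual)) for row in matrix]


def rd_cost(coeffs, step, lam):
    """Crude RD proxy (an assumption, not a real entropy coder):
    squared quantization error + lam * number of nonzero levels."""
    levels = [round(c / step) for c in coeffs]
    distortion = sum((c - l * step) ** 2 for c, l in zip(coeffs, levels))
    rate = sum(1 for l in levels if l != 0)
    return distortion + lam * rate


# Hypothetical candidate subset; a mode-dependent scheme would choose
# this subset according to the prediction mode.
CANDIDATES = {"DCT-II": dct2_matrix, "DST-VII": dst7_matrix}


def select_transform(residual, candidates=CANDIDATES, step=3.0, lam=1.0):
    """Evaluate every candidate and return (best_name, all_costs)."""
    n = len(residual)
    costs = {name: rd_cost(forward(build(n), residual), step, lam)
             for name, build in candidates.items()}
    best = min(costs, key=costs.get)
    return best, costs
```

Under these assumptions, `select_transform([1, 2, 3, 4])` picks DST-VII, because a residual that grows away from the block boundary matches the first DST-VII basis function, while a flat residual such as `[3, 3, 3, 3]` is best served by DCT-II. The index of the winning transform is what an encoder would signal explicitly. A fast method in the spirit of the abstract would skip some of the `rd_cost` evaluations.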
