Robust Learning of 2-D Separable Transforms for Next-Generation Video Coding

Owing to its simplicity and compression efficiency, the Discrete Cosine Transform (DCT) plays a vital role in video compression standards. For next-generation video coding, a new set of 2-D separable transforms has emerged as a candidate to replace the DCT. These separable transforms are learned from the residuals of each intra prediction mode and are hence termed mode-dependent directional transforms (MDDT). MDDT uses the Karhunen-Loève Transform (KLT) to create sets of separable transforms from training data. Since the residuals after intra prediction share structural similarities, transforms that exploit these correlations improve coding efficiency. However, the KLT is optimal only if the data follow a Gaussian distribution without outliers. Because the KLT minimizes a least-squares norm, outliers can arbitrarily affect the directions of its components. In this paper, we address robust learning of separable transforms by enforcing sparsity on the coefficients of the representations. With this new approach, it is possible to improve upon the video coding performance of H.264/AVC by up to 10.2% BD-rate for intra coding. At no additional cost, the proposed techniques can also provide up to 3.9% improvement in BD-rate for intra coding compared to existing MDDT schemes.
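
To make the KLT baseline concrete, the following is a minimal sketch of how a separable transform pair might be learned from the residual blocks of a single intra prediction mode. It illustrates only the plain least-squares KLT described above, not the robust, sparsity-based learning proposed in the paper. The function names (`learn_separable_klt`, `forward`, `inverse`), the 4x4 block size, and the synthetic training data are assumptions made for illustration.

```python
import numpy as np

def learn_separable_klt(blocks):
    """Learn column and row transforms from residual blocks of one mode.

    blocks: array of shape (num_blocks, N, N). Residuals are assumed to be
    roughly zero-mean, so uncentered second moments are used.
    """
    blocks = np.asarray(blocks, dtype=np.float64)
    n_blocks = blocks.shape[0]
    # Average of X @ X.T over all blocks (vertical correlation).
    col_cov = np.einsum("kij,klj->il", blocks, blocks) / n_blocks
    # Average of X.T @ X over all blocks (horizontal correlation).
    row_cov = np.einsum("kji,kjl->il", blocks, blocks) / n_blocks
    # eigh returns eigenvalues in ascending order; reverse the columns so the
    # highest-energy basis vector comes first.
    _, col_vecs = np.linalg.eigh(col_cov)
    _, row_vecs = np.linalg.eigh(row_cov)
    return col_vecs[:, ::-1], row_vecs[:, ::-1]

def forward(block, col_t, row_t):
    """Separable forward transform: Y = C^T X R."""
    return col_t.T @ block @ row_t

def inverse(coeffs, col_t, row_t):
    """Separable inverse transform: X = C Y R^T."""
    return col_t @ coeffs @ row_t.T

# Synthetic stand-in for intra prediction residuals (hypothetical data):
# a cumulative sum along each row gives horizontally correlated 4x4 blocks.
rng = np.random.default_rng(0)
train = np.cumsum(rng.standard_normal((500, 4, 4)), axis=2)

col_t, row_t = learn_separable_klt(train)
block = train[0]
coeffs = forward(block, col_t, row_t)
recon = inverse(coeffs, col_t, row_t)
print("max reconstruction error:", np.abs(recon - block).max())
```

Because both covariance estimates are sums of squared residuals, a few large-magnitude blocks in `train` can dominate them and rotate the learned bases, which is the outlier sensitivity that motivates the robust formulation.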
