An Estimation-Theoretic Framework for Spatially Scalable Video Coding

This paper focuses on prediction optimality in spatially scalable video coding. It draws inspiration from an estimation-theoretic prediction framework for quality (SNR) scalability developed earlier by our group, which achieved optimality by fully accounting for relevant information from the current base layer (e.g., quantization intervals) and the enhancement layer to efficiently calculate the conditional expectation that forms the optimal predictor. Central to that approach was that all layers reconstruct approximations to the same original transform coefficient. In spatial scalability, however, the layers encode versions of the signal at different resolutions. To approach optimality in enhancement layer prediction, this paper departs from existing spatially scalable codecs, which employ pixel domain resampling to perform interlayer prediction. Instead, it incorporates a transform domain resampling technique that ensures that the base layer quantization intervals are accessible and usable at the enhancement layer despite the differing signal resolutions; in conjunction with prior enhancement layer information, these intervals enable optimal prediction. A delayed prediction approach that complements this framework for spatially scalable video coding is then provided to further exploit future base layer frames for additional enhancement layer coding performance gains. Finally, a low-complexity variant of the proposed estimation-theoretic prediction approach is devised, which approximates the conditional expectation by switching between three predictors depending on a simple condition involving information from both layers, and which retains significant performance gains. Simulations provide experimental evidence that the proposed approaches substantially outperform the standard scalable video codec and other leading competitors.
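
The conditional-expectation predictor described in the abstract can be illustrated with a minimal sketch for a single transform coefficient. The abstract does not specify a statistical model, so the sketch assumes one: the coefficient is taken to be Laplacian-distributed around the enhancement layer's motion-compensated prediction `pred` with decay parameter `lam`, and the base layer is taken to confine it to a quantization interval `[lo, hi]`. All function and parameter names are hypothetical. The second function is a crude three-way switch (lower bound, prediction, or upper bound), included only to show what a switched approximation might look like; it is not necessarily the rule used in the paper's low-complexity variant.

```python
import math


def _segment(length, lam):
    """Mass and first moment of exp(-lam * u) over 0 <= u <= length."""
    decay = math.exp(-lam * length)
    mass = (1.0 - decay) / lam
    moment = 1.0 / lam**2 - decay * (length / lam + 1.0 / lam**2)
    return mass, moment


def conditional_mean(pred, lo, hi, lam):
    """E[x | lo <= x <= hi] for x ~ Laplace centred at pred (assumed model).

    pred : enhancement layer motion-compensated prediction (hypothetical name)
    lo, hi : bounds of the base layer quantization interval
    lam : Laplacian decay parameter, density proportional to exp(-lam*|x-pred|)
    """
    if not (lo < hi and lam > 0):
        raise ValueError("need lo < hi and lam > 0")
    if pred <= lo:
        # Interval lies to the right of the mode: decaying exponential from lo.
        mass, moment = _segment(hi - lo, lam)
        return lo + moment / mass
    if pred >= hi:
        # Interval lies to the left of the mode: mirror image of the case above.
        mass, moment = _segment(hi - lo, lam)
        return hi - moment / mass
    # Mode inside the interval: combine the two one-sided segments.
    mass_l, mom_l = _segment(pred - lo, lam)
    mass_r, mom_r = _segment(hi - pred, lam)
    return pred + (mom_r - mom_l) / (mass_l + mass_r)


def clamped_prediction(pred, lo, hi):
    """Illustrative three-way switch: lo, pred, or hi (an assumption)."""
    return min(max(pred, lo), hi)
```

Under the assumed model, the clamped value is the limit of the conditional mean as `lam` grows large, which is why a simple switch can track the full estimate when the prediction residual is sharply peaked.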
