Asymptotic closed-loop design for transform domain temporal prediction

Current video coders exploit temporal dependencies via prediction that consists of motion-compensated pixel copying operations. Such per-pixel temporal prediction ignores important underlying spatial correlations, as well as considerable variations in temporal correlation across frequency components. In the transform domain, however, spatial decorrelation is first achieved, allowing for the true temporal correlation at each frequency to emerge and be properly accounted for, with particular impact at high frequencies, whose lower correlation is otherwise masked by the dominant low frequencies. This paper focuses on effective design of transform domain temporal prediction that: i) fully accounts for the effects of sub-pixel interpolation filters, and ii) circumvents the challenge of catastrophic design instability due to quantization error propagation through the prediction loop. We design predictors conditioned on frequency and sub-pixel position, employing an iterative open-loop (hence stable) design procedure that, on convergence, approximates closed-loop operation. Experimental results validate the effectiveness of both the asymptotic closed-loop design procedure and the transform-domain temporal prediction paradigm, with significant and consistent performance gains over the standard.
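The iterative open-loop design described above can be illustrated with a minimal sketch. It makes several simplifying assumptions that are not from the paper: a single scalar first-order predictor for one transform coefficient, a uniform scalar quantizer standing in for the codec's quantizer, and no motion compensation or sub-pixel conditioning. The names `quantize` and `acl_design` are hypothetical. Each iteration fits the predictor to reconstructions held fixed from the previous iteration, then re-encodes every frame from those same reconstructions, so quantization error cannot propagate within an iteration. If the reconstructions stop changing between iterations, predicting from the previous iteration's reconstructions is the same as predicting from the current ones, which is closed-loop operation.

```python
def quantize(value, step):
    """Uniform scalar quantizer; a stand-in for the codec's actual quantizer."""
    return step * round(value / step)


def acl_design(coeffs, step, iterations=10):
    """Sketch of asymptotic closed-loop design for one prediction coefficient.

    coeffs: original values of a single transform coefficient (one frequency),
            one entry per frame, assumed already aligned along the motion path.
    Returns (rho, recon): the prediction coefficient and final reconstructions.
    """
    # Iteration 0: reconstruct each frame without any temporal prediction.
    recon = [quantize(c, step) for c in coeffs]
    rho = 0.0
    for _ in range(iterations):
        # Open-loop design: least-squares fit of each original sample to the
        # previous frame's reconstruction from the last iteration (held fixed).
        num = sum(c * r for c, r in zip(coeffs[1:], recon[:-1]))
        den = sum(r * r for r in recon[:-1])
        rho = num / den if den > 0 else 0.0
        # Re-encode every frame, still predicting from the old reconstructions,
        # so quantization error does not feed back within this iteration.
        new_recon = [recon[0]]
        for c, r in zip(coeffs[1:], recon[:-1]):
            pred = rho * r
            new_recon.append(pred + quantize(c - pred, step))
        recon = new_recon
    return rho, recon
```

Running `acl_design` separately on a low-frequency and a high-frequency coefficient sequence would yield a different coefficient for each, which is the sense in which the predictor is conditioned on frequency; conditioning on sub-pixel position would, under the same assumptions, amount to keeping a separate training set per position.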
