Progressive Dictionary Learning With Hierarchical Predictive Structure for Low Bit-Rate Scalable Video Coding

Dictionary learning has emerged as a promising alternative to the conventional hybrid coding framework. However, the rigid structure of sequential training and prediction degrades its performance in scalable video coding. This paper proposes a progressive dictionary learning framework with a hierarchical predictive structure for scalable video coding, especially in the low bit-rate region. For pyramidal layers, sparse representation based on a spatio-temporal dictionary is adopted to improve the coding efficiency of the enhancement layers while guaranteeing reconstruction performance. The overcomplete dictionary is trained to adaptively capture local structures along motion trajectories and to exploit the correlations between neighboring resolution layers. Furthermore, progressive dictionary learning is developed to enable scalability in the temporal domain and to restrict error propagation in a closed-loop predictor. Under the hierarchical predictive structure, online learning is leveraged to guarantee training and prediction performance with an improved convergence rate. To remain compatible with the state-of-the-art scalable extension of H.264/AVC and the latest High Efficiency Video Coding (HEVC) standard, standardized codec cores are used to encode the base and enhancement layers. Experimental results show that the proposed method outperforms the latest scalable extension of HEVC and HEVC simulcast over extensive test sequences with various resolutions.
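The two building blocks named above, sparse representation over an overcomplete dictionary and online dictionary learning, can be illustrated with a minimal generic sketch. It is not the paper's progressive, hierarchical method: the function names (`sparse_code_ista`, `online_dictionary_update`, `train_online`), the regularization weight `lam`, and the use of ISTA as the sparse solver are all assumptions made for illustration, and the input is a plain array of vectorized patches rather than spatio-temporal volumes taken along motion trajectories.

```python
import numpy as np

def soft_threshold(x, t):
    # Proximal operator of the l1 norm.
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def sparse_code_ista(D, y, lam=0.1, n_iter=200):
    """Approximately solve min_a 0.5*||y - D a||^2 + lam*||a||_1 with ISTA."""
    L = np.linalg.norm(D, 2) ** 2  # Lipschitz constant of the gradient
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ a - y)
        a = soft_threshold(a - grad / L, lam / L)
    return a

def online_dictionary_update(D, A, B, y, a):
    """One online step: accumulate statistics, then refresh each atom in turn."""
    A += np.outer(a, a)  # running sum of a a^T
    B += np.outer(y, a)  # running sum of y a^T
    for j in range(D.shape[1]):
        if A[j, j] < 1e-12:
            continue  # atom not used so far; leave it unchanged
        u = D[:, j] + (B[:, j] - D @ A[:, j]) / A[j, j]
        D[:, j] = u / max(np.linalg.norm(u), 1.0)  # keep atom norm <= 1
    return D, A, B

def train_online(patches, n_atoms, lam=0.1, seed=0):
    """Process patches one at a time, updating the dictionary after each."""
    rng = np.random.default_rng(seed)
    dim = patches.shape[1]
    D = rng.standard_normal((dim, n_atoms))
    D /= np.linalg.norm(D, axis=0, keepdims=True)
    A = np.zeros((n_atoms, n_atoms))
    B = np.zeros((dim, n_atoms))
    for y in patches:
        a = sparse_code_ista(D, y, lam)
        D, A, B = online_dictionary_update(D, A, B, y, a)
    return D
```

Calling `train_online(patches, n_atoms)` with `n_atoms` larger than the patch dimension yields an overcomplete dictionary. Because each sample is consumed once and only the accumulated matrices `A` and `B` are retained, the dictionary can be refined as new frames arrive, which is the general property that makes online learning attractive under a hierarchical predictive structure.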
