State-of-the-art video coding approaches: A survey

The past few decades have witnessed explosive growth in video services aimed at human viewers. However, because spectrum is limited, efficient video delivery depends on video coding. The core of video coding is compressing video frames by exploiting their redundancy. To reduce spatio-temporal redundancy, several video coding standards built on a hybrid framework have been proposed over the past two decades. In this paper, we therefore first survey the existing video coding standards. Second, we review other video coding approaches that take advantage of state-of-the-art computer vision and machine learning technologies to reduce both the spatio-temporal and the perceptual redundancy of images and videos. Finally, looking back on what has been achieved so far, we discuss what the future may hold for video coding.
