Video Error Concealment Using a Computation-Efficient Low Saliency Prior

Error concealment in packet-loss-corrupted streaming video is inherently an under-determined problem, as there is an insufficient number of well-defined criteria to recover the missing blocks perfectly. When a Region-of-Interest (ROI) based unequal error protection (UEP) scheme is deployed during video streaming (i.e., more visually salient regions are more strongly protected), a lost block is likely to be of low saliency in the original frame. In this paper, we propose to add a low-saliency prior to the error concealment problem as a regularization term. It serves two purposes. First, in ROI-based UEP video streaming, the low-saliency prior provides the correct side information for the client to identify the correct replacement blocks for concealment. Second, in the event that a perfectly matched block cannot be unambiguously identified, the low-saliency prior reduces the viewer's visual attention on the loss-stricken region, resulting in higher overall subjective quality. We study the effectiveness of a low-saliency prior in the context of a previously proposed RECAP error concealment system. RECAP transmits a low-resolution (LR) version of an image alongside the original high-resolution (HR) version, so that if blocks in the HR version are lost, the correctly received LR version can serve as a template for matching suitable replacement blocks from a previously correctly decoded HR frame. We add a low-saliency prior to the block identification process, so that only replacement candidate blocks with both a good match and low saliency are selected. Further, we develop a low-complexity convex approximation to the well-known Itti-Koch-Niebur saliency model, which enables the low-saliency error concealment problem to be solved efficiently.
Experimental results show that: i) the PSNR of the error-concealed frames can be increased by up to 3.6 dB over the original RECAP, showing the effectiveness of a low-saliency prior in the under-determined error concealment problem; and ii) the subjective quality of the video repaired using our proposal, as confirmed by an extensive user study, is better than that of the original RECAP.
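
The idea of adding a low-saliency prior as a regularization term to block matching can be illustrated with a minimal sketch. It is not the paper's algorithm: the function names, the weight `lam`, the averaging downsampler, and the variance-based `toy_saliency` score are all assumptions made for illustration, and `toy_saliency` is only a stand-in for the Itti-Koch-Niebur model or its convex approximation. Each candidate HR block is scored by its mismatch against the LR template plus a weighted saliency penalty, and the lowest-cost candidate is chosen.

```python
def downsample(block, factor=2):
    """Average non-overlapping factor x factor tiles of a 2-D list (assumed LR model)."""
    h, w = len(block), len(block[0])
    out = []
    for i in range(0, h, factor):
        row = []
        for j in range(0, w, factor):
            tile = [block[y][x]
                    for y in range(i, min(i + factor, h))
                    for x in range(j, min(j + factor, w))]
            row.append(sum(tile) / len(tile))
        out.append(row)
    return out


def sse(a, b):
    """Sum of squared errors between two equally sized 2-D lists."""
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def toy_saliency(block):
    """Hypothetical saliency stand-in: pixel variance of the block."""
    pixels = [p for row in block for p in row]
    mean = sum(pixels) / len(pixels)
    return sum((p - mean) ** 2 for p in pixels) / len(pixels)


def select_block(lr_template, candidates, lam=1.0, factor=2):
    """Return (index, cost) of the candidate minimizing
    LR mismatch + lam * saliency. Ties keep the earliest candidate."""
    best_idx, best_cost = None, float("inf")
    for idx, cand in enumerate(candidates):
        cost = sse(downsample(cand, factor), lr_template) + lam * toy_saliency(cand)
        if cost < best_cost:
            best_idx, best_cost = idx, cost
    return best_idx, best_cost


# Two HR candidates that downsample to the same LR value (10.0):
flat = [[10, 10], [10, 10]]      # low variance, low toy saliency
checker = [[0, 20], [20, 0]]     # high variance, high toy saliency
print(select_block([[10.0]], [checker, flat], lam=1.0))
```

In this toy case both candidates match the LR template equally well, so matching alone cannot distinguish them (with `lam=0.0` the first candidate is simply kept). With a positive `lam`, the saliency penalty breaks the tie in favor of the flat block, which mirrors the role the abstract assigns to the prior when the match is ambiguous.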
