Measuring Bitrate and Quality Trade-Off in a Fast Region-of-Interest Based Video Coding

Prevailing video adaptation solutions change the quality of the video uniformly across the whole frame when adjusting the bitrate, whereas region-of-interest (ROI)-based solutions selectively retain quality in the areas of the frame to which viewers are more likely to pay attention. ROI-based coding can improve perceptual quality and viewer satisfaction at the cost of some bandwidth. However, no comprehensive study has so far measured the bitrate vs. perceptual quality trade-off. This paper proposes an ROI detection scheme for videos that is characterized by low computational complexity and robustness, and measures the bitrate vs. quality trade-off for ROI-based encoding using a state-of-the-art H.264/AVC encoder to justify the viability of this type of encoding. The results of the subjective quality test reveal that ROI-based encoding achieves a significant perceptual quality improvement over encoding with uniform quality at the cost of slightly more bits. Based on the bitrate measurements and subjective quality assessments, bitrate and perceptual quality estimation models for non-scalable ROI-based video coding (AVC) are developed, and these are found to be similar to the models for scalable video coding (SVC).
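To make the idea of selectively retaining quality concrete, the following minimal Python sketch builds a per-macroblock quantization parameter (QP) map in which blocks overlapping the ROI keep a lower QP (higher quality) and background blocks receive a higher QP. It is an illustration under stated assumptions, not the scheme used in the paper: the function name `roi_qp_map`, the rectangular ROI, the 16x16 block size, and the default values `qp_roi=28` and `qp_offset=6` are all hypothetical. The only fixed fact relied on is that H.264/AVC QP values for 8-bit video lie in the range 0 to 51.

```python
def roi_qp_map(width, height, roi, qp_roi=28, qp_offset=6, mb=16):
    """Return a grid (list of rows) with one QP value per macroblock.

    roi is a hypothetical rectangle (x0, y0, x1, y1) in pixels,
    with x1 and y1 exclusive. All defaults are illustrative assumptions.
    """
    x0, y0, x1, y1 = roi
    cols = (width + mb - 1) // mb   # macroblocks per row, rounded up
    rows = (height + mb - 1) // mb  # macroblock rows, rounded up
    # Background gets a coarser quantizer, clamped to the H.264 maximum of 51.
    qp_bg = min(51, qp_roi + qp_offset)

    grid = []
    for r in range(rows):
        row = []
        for c in range(cols):
            # Pixel bounds of this macroblock.
            bx0, by0 = c * mb, r * mb
            bx1, by1 = bx0 + mb, by0 + mb
            # A block belongs to the ROI if it overlaps the rectangle at all.
            overlaps = bx0 < x1 and bx1 > x0 and by0 < y1 and by1 > y0
            row.append(qp_roi if overlaps else qp_bg)
        grid.append(row)
    return grid


if __name__ == "__main__":
    # 64x48 frame with an ROI covering the two central macroblocks.
    for row in roi_qp_map(64, 48, (16, 16, 48, 32)):
        print(row)
```

Treating any partially covered block as ROI is a conservative choice that avoids visible quality seams at the ROI boundary, at the cost of a few extra bits.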
