Depth perceptual region-of-interest based multiview video coding

Multiview Video (MVV) has attracted considerable attention recently because it can provide users with three-dimensional perception and interactive functionality. However, MVV data require large amounts of storage and of bandwidth for network transmission. In this paper, we present a novel Depth Perceptual Region-Of-Interest (DP-ROI) based Multiview Video Coding (RMVC) scheme that improves compression efficiency by exploiting redundancies in depth perception. First, we define the DP-ROI according to the three-dimensional depth sensation of the human visual system. Then, an RMVC framework is developed that improves compression efficiency by segmenting the MVV into macroblock-wise DP-ROIs and background regions and encoding them separately. Next, we propose three fast depth-based DP-ROI extraction and tracking algorithms that jointly use motion, texture, and depth together with previously extracted DP-ROIs. Finally, on the basis of the extracted DP-ROIs, a bit allocation optimization model is proposed that allocates more bits to DP-ROIs for high image quality and fewer bits to background regions for a high compression ratio. Experimental results show that the presented RMVC scheme achieves significant coding gains at high bit rates compared with the original Joint Multiview Video Model. Specifically, up to 14.22-23.32% of the bit rate is saved while coding gains of 0.16-0.68 dB are achieved in DP-ROIs, at the cost of image quality degradation in the background.
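The idea of spending more bits on DP-ROIs and fewer on the background can be illustrated with a minimal sketch. This is not the extraction algorithm or the bit allocation optimization model of the paper: it assumes a simple depth threshold to label 16x16 macroblocks (ignoring the motion, texture, and tracking cues), and fixed quantization parameter (QP) offsets in place of an optimized allocation. The names `macroblock_roi_mask`, `assign_qp`, `threshold`, `roi_delta`, and `bg_delta` are hypothetical, and the convention that larger depth values mean closer objects is an assumption.

```python
MB_SIZE = 16  # macroblock size in pixels, as in H.264/AVC


def macroblock_roi_mask(depth, threshold):
    """Label each macroblock as ROI when its mean depth value is >= threshold.

    depth is a 2D list of per-pixel depth values; larger is assumed to mean
    closer to the camera (an assumption, not taken from the paper).
    """
    rows = len(depth) // MB_SIZE
    cols = len(depth[0]) // MB_SIZE
    mask = []
    for r in range(rows):
        mask_row = []
        for c in range(cols):
            total = 0
            for y in range(r * MB_SIZE, (r + 1) * MB_SIZE):
                total += sum(depth[y][c * MB_SIZE:(c + 1) * MB_SIZE])
            mean_depth = total / (MB_SIZE * MB_SIZE)
            mask_row.append(mean_depth >= threshold)
        mask.append(mask_row)
    return mask


def assign_qp(mask, base_qp, roi_delta=2, bg_delta=4):
    """Give ROI macroblocks a lower QP (more bits) and background a higher QP.

    The offsets are illustrative fixed values; QP is clamped to 0..51,
    the range used for 8-bit H.264/AVC.
    """
    def clamp(qp):
        return max(0, min(51, qp))

    return [
        [clamp(base_qp - roi_delta) if is_roi else clamp(base_qp + bg_delta)
         for is_roi in row]
        for row in mask
    ]


# Toy frame: one row of two macroblocks, near object on the left, far on the right.
depth = [[200] * 16 + [50] * 16 for _ in range(16)]
mask = macroblock_roi_mask(depth, threshold=128)
qp = assign_qp(mask, base_qp=30)
```

In this toy frame the left macroblock is labeled as ROI and receives the lower QP, while the right one is treated as background. A real encoder would derive the offsets from a rate-distortion model rather than fixed constants.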
