3D motion estimation for 3D video coding

H.264/MVC multi-view video coding achieves better compression than simulcast coding by using a hierarchical B-picture prediction structure that exploits inter- and intra-view redundancy. However, this technique imposes a random access frame delay and requires substantial computational time. This paper proposes a novel technique based on 3D motion estimation (3D-ME) to overcome both problems. In the 3D-ME technique, a 3D frame is formed from the frames of all views at the same time instant, and ME is carried out for the current 3D frame using the immediately preceding 3D frame as the reference frame. Because the correlation among intra-view images is higher than the correlation among inter-view images, the proposed 3D-ME technique reduces the overall computational time and eliminates the frame delay while keeping rate-distortion (RD) performance comparable to H.264/MVC. A second technique is also proposed, in which an extra reference 3D frame, comprising the dynamic background frame of each view (the most common frame of a scene, i.e., McFIS), is used for 3D-ME. Experimental results reveal that the proposed 3D-ME-McFIS technique outperforms H.264/MVC in RD performance while also reducing computational time and eliminating the random access frame delay.
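The 3D-ME idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify block shape, matching cost, or search strategy, so the code assumes a block that spans all views and shares one motion vector, a sum-of-absolute-differences (SAD) cost, and an exhaustive search in a small window. The names `form_3d_frame`, `sad_3d`, and `estimate_motion_3d` are hypothetical.

```python
from typing import Dict, List, Tuple

Frame = List[List[int]]   # one view: rows of luma samples
Frame3D = List[Frame]     # frames of all views at one time instant


def form_3d_frame(views_at_t: List[Frame]) -> Frame3D:
    """Stack the frames captured by all views at the same time instant."""
    height, width = len(views_at_t[0]), len(views_at_t[0][0])
    for view in views_at_t:
        if len(view) != height or any(len(row) != width for row in view):
            raise ValueError("all views must share one resolution")
    return [[row[:] for row in view] for view in views_at_t]


def sad_3d(cur: Frame3D, ref: Frame3D, y: int, x: int,
           dy: int, dx: int, block: int) -> int:
    """SAD of one block, summed over every view (assumed cost function)."""
    total = 0
    for v in range(len(cur)):
        for i in range(block):
            for j in range(block):
                total += abs(cur[v][y + i][x + j]
                             - ref[v][y + dy + i][x + dx + j])
    return total


def estimate_motion_3d(cur: Frame3D, ref: Frame3D, block: int = 4,
                       search: int = 2) -> Dict[Tuple[int, int], Tuple[int, int]]:
    """Full-search ME of the current 3D frame against the previous 3D frame.

    Returns {(block_y, block_x): (dy, dx)}. One vector is shared by all
    views of a block, which is an assumption of this sketch.
    """
    height, width = len(cur[0]), len(cur[0][0])
    vectors = {}
    for y in range(0, height - block + 1, block):
        for x in range(0, width - block + 1, block):
            # Start from the zero vector, then try every in-bounds offset.
            best = (sad_3d(cur, ref, y, x, 0, 0, block), 0, 0)
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    inside = (0 <= y + dy <= height - block
                              and 0 <= x + dx <= width - block)
                    if not inside:
                        continue
                    cost = sad_3d(cur, ref, y, x, dy, dx, block)
                    if cost < best[0]:
                        best = (cost, dy, dx)
            vectors[(y, x)] = (best[1], best[2])
    return vectors
```

Because the only reference is the immediately preceding 3D frame, no future frame has to be buffered, which is consistent with the claim that the frame delay is eliminated. An additional reference, such as a 3D frame built from per-view McFIS backgrounds, could be searched with the same function and the lower-cost match kept; that selection step is not shown here.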
