Generating the Depth Map from the Motion Information of H.264-Encoded 2D Video Sequence

An efficient method is presented that estimates the depth map of a 3D scene from the motion information of H.264-encoded 2D video. The motion information of video frames captured by a single camera is used, either directly or after modification, to approximate the displacement (disparity) that would exist between the left and right images if the scene were captured by stereoscopic cameras. Depth is then estimated from its inverse relation to disparity. The low complexity of the method and its compatibility with future broadcasting networks allow real-time implementation at the receiver, so the 3D signal is constructed without any additional burden on the network. Performance evaluations show that the method outperforms the other existing H.264-based technique by up to 1.98 dB in PSNR and provides more realistic depth information for the scene. Moreover, subjective comparisons, in which viewers watched the generated stereo video sequences on a 3D display system, confirm the superiority of our method.
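The inverse relation between depth and disparity can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the standard stereo geometry Z = f·B/d, treats the magnitude of the horizontal motion-vector component as the disparity proxy, and uses hypothetical names and defaults (`depth_from_motion`, `focal_length`, `baseline`, `eps`) that do not come from the paper.

```python
def depth_from_motion(motion_vectors, focal_length=1.0, baseline=1.0, eps=1e-6):
    """Map per-block motion vectors (dx, dy) to relative depth values.

    Assumptions (illustrative, not taken from the paper):
      * disparity is approximated by |dx|, the horizontal motion component;
      * depth follows the stereo relation Z = f * B / d;
      * eps guards against division by zero for static blocks.
    """
    depth_map = []
    for row in motion_vectors:
        depth_row = []
        for dx, dy in row:
            disparity = max(abs(dx), eps)  # dy is ignored in this sketch
            depth_row.append(focal_length * baseline / disparity)
        depth_map.append(depth_row)
    return depth_map


# Blocks with larger horizontal motion are assigned smaller depth (closer).
example_vectors = [[(2, 0), (4, 3)], [(1, 1), (8, 0)]]
relative_depth = depth_from_motion(example_vectors)
```

In this sketch the depth values are relative only; recovering metric depth would require the real focal length and an equivalent camera baseline, which a single-camera sequence does not provide directly.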
