Depth-Based 3D Video Formats and Coding Technology

The primary usage scenario for 3D video (3DV) formats is to support depth perception of a visual scene as provided by a 3D display system. There are many types of 3D display systems, ranging from classic stereoscopic systems that require special-purpose glasses to more sophisticated multiview autostereoscopic displays that do not require glasses (Konrad and Halle, 2007). While stereoscopic systems require only two views, multiview displays have much higher data throughput requirements, since the 3D effect is achieved by emitting multiple videos to form view-dependent pictures. Such displays can be implemented, for example, using conventional high-resolution displays and parallax barriers; other technologies include lenticular overlay sheets and holographic screens. Each view-dependent video sample can be thought of as emitting a small number of light rays in a set of discrete viewing directions, typically between eight and a few dozen for an autostereoscopic display. Often these directions are distributed in a horizontal plane, such that parallax effects are limited to the horizontal motion of the observer. A more comprehensive review of 3D display technologies is given in Chapter 15, as well as by Benzie et al. (2007). An overview can also be found in Ozaktas and Onural (2007).

Emerging Technologies for 3D Video: Creation, Coding, Transmission and Rendering

This work may not be copied or reproduced in whole or in part for any commercial purpose. Permission to copy in whole or in part without payment of fee is granted for nonprofit educational and research purposes provided that all such whole or partial copies include the following: a notice that such copying is by permission of Mitsubishi Electric Research Laboratories, Inc.; an acknowledgment of the authors and individual contributions to the work; and all applicable portions of the copyright notice.
Copying, reproduction, or republishing for any other purpose shall require a license with payment of fee to Mitsubishi Electric Research Laboratories, Inc. All rights reserved. Copyright © Mitsubishi Electric Research Laboratories, Inc., 2013. 201 Broadway, Cambridge, Massachusetts 02139

[1] Gary J. Sullivan, et al. Overview of the Stereo and Multiview Video Coding Extensions of the H.264/MPEG-4 AVC Standard, 2011, Proceedings of the IEEE.

[2] Pascal Frossard, et al. Sparse stereo image coding with learned dictionaries, 2011, 2011 18th IEEE International Conference on Image Processing.

[3] Luc Van Gool, et al. Advanced three-dimensional television system technologies, 2002, Proceedings. First International Symposium on 3D Data Processing Visualization and Transmission.

[4] Richard Szeliski, et al. A Taxonomy and Evaluation of Dense Two-Frame Stereo Correspondence Algorithms, 2001, International Journal of Computer Vision.

[5] M. Halle, et al. 3-D Displays and Signal Processing, 2007, IEEE Signal Processing Magazine.

[6] Richard Szeliski, et al. A Comparative Study of Energy Minimization Methods for Markov Random Fields, 2006, ECCV.

[7] Margrit Gelautz, et al. A layered stereo matching algorithm using image segmentation and global visibility constraints, 2005.

[8] Heiko Schwarz, et al. Motion vector inheritance for high efficiency 3D video plus depth coding, 2012, 2012 Picture Coding Symposium.

[9] Daniel P. Huttenlocher, et al. Efficient Belief Propagation for Early Vision, 2004, Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2004).

[10] Vladimir Kolmogorov, et al. Convergent Tree-Reweighted Message Passing for Energy Minimization, 2006, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[11] Oliver Schreer, et al. Stereo analysis by hybrid recursive matching for real-time immersive video conferencing, 2004, IEEE Transactions on Circuits and Systems for Video Technology.

[12] Aljoscha Smolic, et al. View Synthesis for Advanced 3D Video Systems, 2008, EURASIP J. Image Video Process.

[13] Anthony Vetro, et al. Temporally consistent stereo matching using coherence function, 2010, 2010 3DTV-Conference: The True Vision - Capture, Transmission and Display of 3D Video.

[14] Heiko Schwarz, et al. Synthesized View Distortion Based 3D Video Coding for Extrapolation and Interpolation of Views, 2012, 2012 IEEE International Conference on Multimedia and Expo.

[15] Aljoscha Smolic, et al. Efficient Prediction Structures for Multiview Video Coding, 2007, IEEE Transactions on Circuits and Systems for Video Technology.

[16] Vladimir Kolmogorov, et al. Multi-camera Scene Reconstruction via Graph Cuts, 2002, ECCV.

[17] Aljoscha Smolic, et al. The effects of multiview depth video compression on multiview rendering, 2009, Signal Process. Image Commun.

[18] Tao Chen, et al. 3D-TV Content Storage and Transmission, 2011, IEEE Transactions on Broadcasting.

[19] Antonio Ortega, et al. Depth map coding with distortion estimation of rendered view, 2010, Electronic Imaging.

[20] Markus H. Gross, et al. 3D video fragments: dynamic point samples for real-time free-viewpoint video, 2004, Comput. Graph.

[21] Cevahir Çigla, et al. Region-Based Dense Depth Extraction from Multi-View Video, 2007, ICIP.

[22] Detlev Marpe, et al. 3D video: Depth coding based on inter-component prediction of block partitions, 2012, 2012 Picture Coding Symposium.

[23] Ismo Rakkolainen, et al. A Survey of 3DTV Displays: Techniques and Technologies, 2007, IEEE Transactions on Circuits and Systems for Video Technology.

[24] ITU-T and ISO/IEC JTC 1. Advanced video coding for generic audiovisual services, 2010.

[25] N. Atzpadin, et al. Depth map creation and image-based rendering for advanced 3DTV services providing interoperability and scalability, 2007, Signal Process. Image Commun.

[26] Thomas Wiegand, et al. 3-D Video Representation Using Depth Maps, 2011, Proceedings of the IEEE.

[27] Sehoon Yea, et al. View synthesis prediction for multiview video coding, 2009, Signal Process. Image Commun.

[28] Ajay Luthra, et al. Overview of the H.264/AVC video coding standard, 2003, IEEE Trans. Circuits Syst. Video Technol.

[29] Donald P. Brutzman, et al. The virtual reality modeling language and Java, 1998, CACM.

[30] Yo-Sung Ho, et al. Three-dimensional video generation using foreground separation and disocclusion detection, 2010, 2010 3DTV-Conference: The True Vision - Capture, Transmission and Display of 3D Video.

[31] Yo-Sung Ho, et al. Depth Reconstruction Filter and Down/Up Sampling for Depth Coding in 3-D Video, 2009, IEEE Signal Processing Letters.

[32] Anthony Vetro, et al. View Synthesis for Multiview Video Compression, 2006.

[33] Yo-Sung Ho, et al. View-consistent multi-view depth estimation for three-dimensional video generation, 2010, 2010 3DTV-Conference: The True Vision - Capture, Transmission and Display of 3D Video.