Reference frame selection for loss-resilient texture & depth map coding in multiview video conferencing

In a free-viewpoint video conferencing system, the viewer can choose any desired viewpoint of the 3D scene for observation. Images for an arbitrarily chosen viewpoint can be rendered through depth-image-based rendering (DIBR), which typically employs the “texture-plus-depth” video format for 3D data exchange. Robust and timely transmission of multiple texture and depth maps over bandwidth-constrained, loss-prone networks is a challenging problem. In this paper, we optimize the transmission of multiview video in texture-plus-depth format over a lossy channel for free-viewpoint synthesis at the decoder. In particular, we construct a recursive model to estimate the distortion in the synthesized view caused by errors in both texture and depth maps. We then formulate a rate-distortion optimization problem that selects reference pictures for macroblock encoding in H.264 in a computationally efficient way, so that different macroblocks receive unequal protection. Results show that the proposed scheme can outperform random insertion of intra refresh blocks by up to 0.73 dB at 5% loss.
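The rate-distortion selection mentioned above can be illustrated with a generic Lagrangian cost, J = D + λR, minimized over candidate references for one macroblock. This is only a minimal sketch, not the paper's recursive distortion model: the `Candidate` type, the `select_reference` function, the reference labels, and all numeric values are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One coding option for a macroblock (all fields are illustrative)."""
    ref: str           # hypothetical label for the reference picture, or "intra"
    distortion: float  # assumed estimate of expected synthesized-view distortion
    rate: float        # assumed bit cost of coding the macroblock this way


def select_reference(candidates, lam):
    """Return the candidate minimizing the Lagrangian cost D + lam * R."""
    if not candidates:
        raise ValueError("at least one candidate is required")
    return min(candidates, key=lambda c: c.distortion + lam * c.rate)


# Made-up numbers: intra coding is robust but expensive, the previous frame
# is cheap but more exposed to error propagation, an older frame sits between.
candidates = [
    Candidate("intra", distortion=10.0, rate=400.0),
    Candidate("prev_frame", distortion=40.0, rate=100.0),
    Candidate("older_frame", distortion=25.0, rate=180.0),
]

best = select_reference(candidates, lam=0.1)
print(best.ref)
```

In this toy setting, a larger λ pushes the choice toward cheaper references and a smaller λ toward more robust ones, which is one way a per-macroblock choice of reference can yield unequal protection.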
