Interactive multiview video system with low decoding complexity

Multimedia research continually seeks new ways to improve the immersive experience of users. One current solution is to design systems that offer a high level of interactivity, such as multiview content navigation, where the point of view can be changed while watching a video sequence (e.g., free viewpoint television, gaming). The coding algorithms designed for the transmission of such media streams must be adapted to these novel decoder needs. However, video plus depth data is usually transmitted by treating the information flows as two sequences encoded with multiview video coding (MVC) schemes. Although this approach achieves good compression performance, it is not appropriate for interactive applications, since decoding a frame often requires the prior transmission and decoding of several reference frames. Moreover, the techniques recently developed to improve interactivity are generally implemented at the decoder, which increases its computational complexity requirements. In this paper, we propose a novel coding scheme for video plus depth sequences that is adapted to user navigation; in contrast to several common approaches, the additional complexity is placed on the encoder side so that the decoder stays simple. We further propose to limit the additional bandwidth imposed by interactivity requirements by designing a rate allocation algorithm that builds on a model of user behavior. A first version of our novel coding architecture is evaluated in terms of rate-distortion performance, where it is shown to offer high interactivity at a reasonable bandwidth cost.
