Interactive omnidirectional video delivery: A bandwidth-effective approach

Omnidirectional video (cylindrical or spherical) is an emerging medium that is becoming increasingly popular thanks to its interactivity, both for online multimedia applications such as Google Street View and for video surveillance and robotics applications. Interactivity in this context means that the user can explore and navigate audio-visual scenes by freely choosing the viewpoint and viewing direction. To provide this key feature, omnidirectional video is typically represented as a classical two-dimensional (2D) rectangular panorama video that is mapped onto a spherical or cylindrical mesh and then rendered on the client's screen. Early transmission models for this full panorama video and mesh content simply treat the panorama as a high-resolution video to be encoded at uniform quality. However, the user generally views only a restricted field of view of the content at any time and interacts with it through pan-tilt-zoom commands, so a significant share of the bandwidth is wasted on transmitting high-quality video for regions that are never visualized. In this paper we evaluate the relevance and optimality of a personalized transmission scheme in which quality is modulated across spherical or cylindrical regions according to their likelihood of being viewed during a live user interaction. We show, under interaction-delay as well as bandwidth constraints, how tiling and predictive mechanisms can improve on existing approaches. © 2012 Alcatel-Lucent.
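The following Python sketch illustrates the core idea of likelihood-driven quality modulation; it is not the paper's algorithm. It splits an equirectangular panorama into tiles, scores each tile by its angular distance from a predicted viewing direction using an assumed Gaussian fall-off, and spreads a bandwidth budget over the tiles in proportion to those scores. The 8x4 tile grid, the rate figures, the sigma parameter, and all function names are illustrative assumptions.

```python
# A minimal sketch (assumed model, not the paper's method) of
# viewport-adaptive tile bitrate allocation for panoramic video.
import math

def tile_centers(n_cols=8, n_rows=4):
    """Yield (yaw, pitch) centers, in radians, of an n_cols x n_rows tile
    grid covering the full equirectangular panorama."""
    for r in range(n_rows):
        pitch = math.pi * (r + 0.5) / n_rows - math.pi / 2      # [-pi/2, pi/2]
        for c in range(n_cols):
            yaw = 2 * math.pi * (c + 0.5) / n_cols - math.pi    # [-pi, pi]
            yield (yaw, pitch)

def angular_distance(a, b):
    """Great-circle angle between two (yaw, pitch) directions on the sphere,
    via the spherical law of cosines."""
    ya, pa = a
    yb, pb = b
    cos_d = (math.sin(pa) * math.sin(pb)
             + math.cos(pa) * math.cos(pb) * math.cos(ya - yb))
    return math.acos(max(-1.0, min(1.0, cos_d)))

def view_likelihood(tile, predicted_view, sigma=0.6):
    """Illustrative Gaussian fall-off in the angle to the predicted viewport;
    a real system would learn this from interaction traces."""
    d = angular_distance(tile, predicted_view)
    return math.exp(-(d * d) / (2 * sigma * sigma))

def allocate_bitrates(predicted_view, budget_kbps=8000,
                      floor_kbps=100, n_cols=8, n_rows=4):
    """Split the bandwidth budget over tiles in proportion to view
    likelihood, keeping a small floor rate so off-view tiles remain
    decodable after a sudden change of viewing direction."""
    tiles = list(tile_centers(n_cols, n_rows))
    weights = [view_likelihood(t, predicted_view) for t in tiles]
    spendable = budget_kbps - floor_kbps * len(tiles)
    total_w = sum(weights)
    return [(t, floor_kbps + spendable * w / total_w)
            for t, w in zip(tiles, weights)]

if __name__ == "__main__":
    # User predicted to look straight ahead (yaw=0, pitch=0).
    for tile, rate in allocate_bitrates((0.0, 0.0))[:8]:
        print(f"tile(yaw={tile[0]:+.2f}, pitch={tile[1]:+.2f})"
              f" -> {rate:6.1f} kbps")
```

The floor rate reflects the interaction-delay constraint discussed above: if every bit went to the predicted viewport, a fast head turn would leave the user staring at undecodable regions until the next allocation round.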
