On media data structures for interactive streaming in immersive applications

Interactive media streaming is the communication paradigm in which an observer periodically requests desired media subsets from the streaming sender in real time; the sender responds by transmitting the media data corresponding to each request for immediate decoding and display. This is in contrast to non-interactive media streaming, e.g., TV broadcast, where the entire media set is compressed and delivered to the observer before the observer interacts with the data (such as switching TV channels). Examples of interactive streaming abound across media modalities, including interactive browsing of JPEG2000 images and interactive light field or multiview video streaming. Interactive media streaming has the obvious advantage of bandwidth efficiency: only the media subsets corresponding to the observer's requests are transmitted. This is important when an observer views only a small subset of a very large media data set during a typical streaming session. The technical challenge is how to structure media data so that good compression efficiency can be achieved by exploiting correlation among media subsets (which induces a particular decoding order when that correlation is exploited during encoding), while still providing sufficient flexibility for the observer to navigate the media data set freely in any desired order. In this overview paper, we survey proposals in the literature that simultaneously balance the conflicting objectives of compression efficiency and decoding flexibility.
