Video Streaming with Interactive Pan/Tilt/Zoom

High-spatial-resolution videos offer the possibility of viewing an arbitrary region-of-interest (RoI) interactively: the user can pan, tilt, and zoom while watching the video. This chapter presents spatial-random-access-enabled video compression, which encodes the content so that arbitrary RoIs corresponding to different zoom factors can be extracted from the compressed bit-stream. The chapter also covers RoI trajectory prediction, which allows relevant content to be pre-fetched in a streaming scenario. The more accurate the prediction, the lower the percentage of missing pixels. RoI prediction techniques can perform better by adapting to the video content in addition to simply extrapolating previous moves of the input device. Finally, the chapter presents a streaming system that employs application-layer peer-to-peer (P2P) multicast while still allowing users to freely choose individual RoIs. The P2P overlay adapts on the fly to exploit the commonalities in the peers’ RoIs. This enables peers to relay data to each other in real time, thus drastically reducing the bandwidth required from dedicated servers.
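
The link between prediction accuracy and missing pixels can be illustrated with a minimal sketch. It is not the chapter's method: it assumes the simplest possible predictor (constant-velocity extrapolation of the last two RoI positions) and a rectangular RoI in pixel coordinates. The names `RoI`, `extrapolate`, and `missing_pixel_fraction` are hypothetical and chosen only for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RoI:
    """Axis-aligned region-of-interest in pixel coordinates (illustrative type)."""
    x: int  # left edge
    y: int  # top edge
    w: int  # width
    h: int  # height


def extrapolate(prev: RoI, curr: RoI, steps: int = 1) -> RoI:
    """Predict a future RoI by assuming the top-left corner keeps its velocity.

    The size is copied from the current RoI, i.e. the zoom factor is
    assumed to stay unchanged over the prediction horizon.
    """
    return RoI(
        curr.x + steps * (curr.x - prev.x),
        curr.y + steps * (curr.y - prev.y),
        curr.w,
        curr.h,
    )


def missing_pixel_fraction(actual: RoI, fetched: RoI) -> float:
    """Fraction of the actual RoI that is not covered by the pre-fetched RoI."""
    overlap_w = max(0, min(actual.x + actual.w, fetched.x + fetched.w)
                    - max(actual.x, fetched.x))
    overlap_h = max(0, min(actual.y + actual.h, fetched.y + fetched.h)
                    - max(actual.y, fetched.y))
    return 1.0 - (overlap_w * overlap_h) / (actual.w * actual.h)


if __name__ == "__main__":
    prev, curr = RoI(0, 0, 100, 100), RoI(10, 0, 100, 100)
    predicted = extrapolate(prev, curr)   # user assumed to keep panning right
    actual = RoI(30, 0, 100, 100)         # user actually panned faster
    print(predicted, missing_pixel_fraction(actual, predicted))
```

In this sketch a perfect prediction yields a missing fraction of zero, and the fraction grows as the predicted rectangle drifts away from the RoI the user actually selects. A content-aware predictor would replace `extrapolate` while leaving the metric unchanged.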
