Adaptive encoding of zoomable video streams based on user access pattern

Zoomable video lets users zoom and pan into regions of interest (RoIs) within a video to view them at higher resolution. Such interaction requires dynamically cropping RoIs from the source video. We have previously explored two ways of encoding and transmitting video to support dynamic RoI cropping: (i) Monolithic streaming encodes the video with a standard video encoder. When an RoI is requested, the bits belonging to the RoI are transmitted, along with any other bits needed to decode it because of encoding dependencies. (ii) Tile streaming divides the video into rectangular tiles that are encoded independently; the tiles that intersect a requested RoI are transmitted. In this paper, we consider how careful encoding of the source video can reduce the bandwidth needed to transmit the RoIs under each of the two schemes. The goal is to support bandwidth-efficient, compressed-domain RoI cropping for virtual zoom and pan by tuning encoder parameters. Our key idea is to exploit user access patterns: we encode different regions of the video with different encoding parameters according to each region's popularity. We show that our encoding method can reduce the expected bandwidth by up to 43% on the test video sequence we used.
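To make the tile streaming idea and the notion of expected bandwidth concrete, the following is a minimal sketch, not the paper's implementation. It assumes a uniform tile grid, RoIs given as pixel rectangles, a known per-tile bit cost, and an access pattern expressed as a probability per RoI. The names `tiles_for_roi`, `expected_bits`, `roi_prob`, and `tile_bits` are hypothetical and chosen for illustration only.

```python
from typing import Dict, List, Tuple

Rect = Tuple[int, int, int, int]  # (x, y, width, height) in pixels
Tile = Tuple[int, int]            # (column, row) index in the tile grid


def tiles_for_roi(roi: Rect, tile_w: int, tile_h: int) -> List[Tile]:
    """Return the (col, row) indices of all grid tiles that intersect the RoI.

    Assumes a uniform grid of tile_w x tile_h tiles and a non-empty RoI.
    """
    x, y, w, h = roi
    first_col, last_col = x // tile_w, (x + w - 1) // tile_w
    first_row, last_row = y // tile_h, (y + h - 1) // tile_h
    return [(c, r)
            for r in range(first_row, last_row + 1)
            for c in range(first_col, last_col + 1)]


def expected_bits(roi_prob: Dict[Rect, float],
                  tile_bits: Dict[Tile, int],
                  tile_w: int, tile_h: int) -> float:
    """Expected bits per request: sum over RoIs of P(roi) * bits of its tiles."""
    return sum(p * sum(tile_bits[t] for t in tiles_for_roi(roi, tile_w, tile_h))
               for roi, p in roi_prob.items())


# Hypothetical example: a 256x128 frame split into 64x64 tiles,
# each tile costing 1000 bits, with two RoIs of different popularity.
tile_bits = {(c, r): 1000 for c in range(4) for r in range(2)}
roi_prob = {(32, 32, 64, 64): 0.75,   # popular RoI straddling four tiles
            (128, 0, 64, 64): 0.25}   # less popular RoI aligned to one tile
cost = expected_bits(roi_prob, tile_bits, 64, 64)
```

In this toy setup the popular RoI straddles four tiles while the less popular one fits in a single tile, so the popular region dominates the expected cost. That is the kind of imbalance that encoding regions differently by popularity aims to reduce.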

[12]  Wei Tsang Ooi, et al.  Adaptive encoding of zoomable video streams based on user access pattern, 2011, MMSys.
