Mixing Tile Resolutions in Tiled Video: A Perceptual Quality Assessment

The mismatch between increasingly large video resolutions and the constrained screen sizes of mobile devices has led to proposals for zoomable video systems based on tiled video. In current systems, a tiled video frame is constructed from multiple tiles drawn from a single-resolution stream. In this paper, we explore the perceptual effect of mixed-resolution tiles in tiled video, in which tiles within a video frame may come from streams with different resolutions, with the aim of trading off bandwidth against perceptual video quality. To understand how users perceive the quality of mixed-resolution tiled video, we conducted a psychophysical study with 50 participants on tiled videos in which each tile's resolution is chosen at random from two resolution levels with equal probability. The experimental results show that in many cases, tiles from an HD (1920×1080p) stream can be mixed with tiles from a 1600×900p stream without viewers noticing. Even when participants notice quality degradation in videos that combine tiles from the HD stream with tiles from a 960×540p stream, the majority of participants still accept the degradation when viewing videos with low and medium motion, and more than 40% of participants accept it when viewing videos with dense motion.
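As a rough illustration of the stimulus construction described above, the following sketch assigns each tile of a frame to one of two resolution levels with equal probability. It is a minimal sketch under stated assumptions, not the authors' implementation: the grid dimensions, the function names, and the use of pixel count as a stand-in for bandwidth are all hypothetical choices made here for illustration.

```python
import random

# Resolution levels named in the abstract, as (width, height).
HD = (1920, 1080)
MID = (1600, 900)
LOW = (960, 540)


def assign_tile_resolutions(rows, cols, level_a, level_b, seed=None):
    """Pick each tile's source stream from two levels with equal probability.

    `rows` and `cols` are hypothetical; the abstract does not give the grid size.
    """
    rng = random.Random(seed)
    return [[rng.choice((level_a, level_b)) for _ in range(cols)]
            for _ in range(rows)]


def relative_pixel_cost(grid, reference=HD):
    """Average per-tile pixel count relative to an all-`reference` frame.

    Assumption: pixel count stands in for bandwidth. A real encoder's
    bitrate does not scale exactly with pixel count.
    """
    ref_pixels = reference[0] * reference[1]
    tiles = [res for row in grid for res in row]
    return sum(w * h for w, h in tiles) / (len(tiles) * ref_pixels)
```

Under this pixel-count proxy, a frame that is half HD tiles and half 960×540 tiles carries 62.5% of the pixels of an all-HD frame, which gives a sense of the bandwidth side of the trade-off the study examines.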
