Efficient Encoding of Interactive Personalized Views Extracted from Immersive Video Content

Traditional television limits viewers to a single viewpoint. New technologies such as virtual reality glasses will change how people experience video: instead of being limited to a single viewpoint, viewers will demand a more immersive experience that gives them a sense of being present in a sports stadium, a concert hall, or at another event. To satisfy these users, 360-degree or panoramic video needs to be delivered to their homes. Because these videos have an extremely high resolution, sending the entire video requires high bandwidth and results in high decoding complexity at the viewer's side. The traditional approach to this problem is to split the original video into tiles and send only the required tiles to the viewer. However, this approach still has a large bit rate overhead compared to sending only the required view. Therefore, we propose to send only a personalized view to each user. Since this paper focuses on reducing the computational cost of such a system, we accelerate the encoding of each personalized view using coding information obtained from a pre-analysis of the entire ultra-high-resolution video. Using the High Efficiency Video Coding Test Model (HM), this reduces the complexity of each individual personalized-view encode by more than 96.5% compared to a full encode of the view. The acceleration results in a bit rate overhead of at most 19.5%, which is smaller than the bit rate overhead of the tile-based method.
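To illustrate why the tile-based approach sends more than the required view, the following minimal Python sketch selects the tiles of a regular grid that overlap a rectangular view and reports how many extra pixels are transmitted. The grid layout, the function names (`tiles_for_view`, `pixel_overhead`), and the example dimensions are assumptions made for illustration only; they are not taken from the paper. Pixel overhead is also only a rough proxy and does not correspond to the bit rate overhead figures quoted above.

```python
def tiles_for_view(view_x, view_y, view_w, view_h, tile_w, tile_h):
    """Return (col, row) indices of all grid tiles overlapping the view.

    Coordinates are in pixels of the full panorama; the tile grid is
    assumed to be regular and aligned to the top-left corner (hypothetical).
    """
    first_col = view_x // tile_w
    last_col = (view_x + view_w - 1) // tile_w
    first_row = view_y // tile_h
    last_row = (view_y + view_h - 1) // tile_h
    return [
        (col, row)
        for row in range(first_row, last_row + 1)
        for col in range(first_col, last_col + 1)
    ]


def pixel_overhead(view_w, view_h, tile_w, tile_h, n_tiles):
    """Fraction of extra pixels sent by the tiles relative to the view itself."""
    sent_pixels = n_tiles * tile_w * tile_h
    return sent_pixels / (view_w * view_h) - 1.0


# Example: a 1920x1080 view at (1000, 500) on a grid of 512x512 tiles.
tiles = tiles_for_view(1000, 500, 1920, 1080, 512, 512)
overhead = pixel_overhead(1920, 1080, 512, 512, len(tiles))
print(f"{len(tiles)} tiles, {overhead:.1%} extra pixels")
```

In this example the view straddles tile boundaries on every side, so whole tiles are sent for regions of which only a part is visible. Sending a personalized view avoids this, at the cost of one encode per user, which is the cost the proposed acceleration targets.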
