Convolutional Neural Network Based Inter-Frame Enhancement for 360-Degree Video Streaming

360-degree video has attracted increasing attention in recent years. However, transmitting such high-resolution video within limited bandwidth remains highly challenging. In this paper, we first propose to compress the cubemaps in each frame of the 360-degree video unequally, which reduces the total bitrate of the transmitted data. Specifically, different versions of the video are transmitted alternately, with a Group of Pictures (GOP) as the switching unit. Each version consists of 3 high-quality cubemaps and 3 low-quality cubemaps. A convolutional neural network (CNN) is then introduced to enhance the low-quality cubemaps using the high-quality cubemaps by exploiting inter-frame similarities. Experiments show that a single CNN model can be applied to a variety of videos. The results also show that the proposed method improves quality over the benchmark in terms of PSNR, especially for videos with slow motion.
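The alternating transmission described above can be illustrated with a small scheduling sketch. It is not the authors' implementation: the two-version layout (cubemaps 0-2 high quality in one version, 3-5 in the other), the GOP size, and all function names are assumptions made for illustration. The sketch shows which cubemaps are high quality in a given frame, and where the nearest high-quality copy of a low-quality cubemap could be found as a reference for inter-frame enhancement.

```python
from typing import Dict, List, Optional

NUM_FACES = 6  # cubemaps per frame in a cube projection

# Hypothetical layout: version 0 carries cubemaps 0-2 in high quality,
# version 1 carries cubemaps 3-5 in high quality. The paper's actual
# grouping is not given in the abstract and may differ.
VERSIONS: List[Dict[int, str]] = [
    {f: ("high" if f < 3 else "low") for f in range(NUM_FACES)},
    {f: ("low" if f < 3 else "high") for f in range(NUM_FACES)},
]


def version_for_frame(frame_idx: int, gop_size: int) -> int:
    """Versions alternate once per GOP: GOP 0 -> version 0, GOP 1 -> version 1, ..."""
    return (frame_idx // gop_size) % len(VERSIONS)


def face_quality(frame_idx: int, face: int, gop_size: int) -> str:
    """Quality label ('high' or 'low') of one cubemap in one frame."""
    return VERSIONS[version_for_frame(frame_idx, gop_size)][face]


def nearest_high_quality_frame(
    frame_idx: int, face: int, gop_size: int, num_frames: int
) -> Optional[int]:
    """Closest frame in which `face` was sent in high quality.

    Searches outward from frame_idx, preferring the earlier frame on ties.
    Such a frame is a candidate reference for enhancing the low-quality copy.
    """
    for distance in range(num_frames):
        for candidate in (frame_idx - distance, frame_idx + distance):
            if 0 <= candidate < num_frames and (
                face_quality(candidate, face, gop_size) == "high"
            ):
                return candidate
    return None


if __name__ == "__main__":
    gop_size, num_frames = 4, 16  # assumed values for the demonstration
    for frame in (0, 4, 5):
        ref = nearest_high_quality_frame(frame, 0, gop_size, num_frames)
        print(frame, face_quality(frame, 0, gop_size), ref)
```

With slow motion, the reference frame found this way is likely to resemble the low-quality cubemap closely, which is consistent with the abstract's observation that the gain is largest for slow-motion videos.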
