Panoramic video assessment based on cascaded network using saliency map

Virtual reality (VR) is a technology that lets people experience a virtual world in an artificial environment. As one of the most important forms of VR media content, panoramic video gives viewers a free 360-degree choice of viewing angle. However, the acquisition, stitching, transmission and playback of panoramic video can each degrade video quality and seriously affect the viewer's quality of experience. How to improve display quality and provide users with a better visual experience has therefore become a hot topic in this field. When watching video, people attend mainly to salient areas, especially in panoramic video, where viewers can freely choose their regions of interest. Given this characteristic, saliency information should be used when performing quality assessment. In this paper, we use two cascaded networks to compute the quality score of a panoramic video without a reference video. First, a saliency prediction network computes the saliency map of the image, and the patches with higher saliency are selected using that map. In this way, we exclude areas of the panoramic image that do not contribute to the quality assessment task. Then, we feed the selected small salient patches into a quality assessment network for prediction and obtain the final image quality score. Experimental results show that the proposed method achieves more accurate quality scores for panoramic videos than state-of-the-art methods, owing to its cascaded network structure.
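The patch-selection step between the two networks can be sketched as follows. This is only a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give the patch size, the number of patches, or how patch scores are combined, so the non-overlapping grid, the top-k rule, the mean aggregation, and the names `select_salient_patches`, `frame_quality` and `score_patch` are all hypothetical. The saliency map and the patch scorer are taken as inputs, standing in for the saliency prediction network and the quality assessment network.

```python
import numpy as np


def select_salient_patches(frame, saliency, patch=64, k=8):
    """Return the k non-overlapping grid patches with the highest mean saliency.

    frame:    H x W x C image array (one panoramic frame).
    saliency: H x W saliency map, assumed to come from a saliency network.
    patch, k: assumed values; the abstract does not specify them.
    """
    h, w = saliency.shape
    rows, cols = h // patch, w // patch
    # Mean saliency of each grid cell (pixels past the last full cell are ignored).
    cells = (
        saliency[: rows * patch, : cols * patch]
        .reshape(rows, patch, cols, patch)
        .mean(axis=(1, 3))
    )
    # Indices of the k most salient cells, most salient first.
    order = np.argsort(cells.ravel())[::-1][:k]
    patches = []
    for idx in order:
        r, c = divmod(int(idx), cols)
        patches.append(frame[r * patch : (r + 1) * patch, c * patch : (c + 1) * patch])
    return np.stack(patches)


def frame_quality(frame, saliency, score_patch, patch=64, k=8):
    """Score only the salient patches and average them (assumed aggregation).

    score_patch is a placeholder callable for the quality assessment network.
    """
    patches = select_salient_patches(frame, saliency, patch, k)
    return float(np.mean([score_patch(p) for p in patches]))
```

In such a sketch, `score_patch` would wrap a trained no-reference quality model, and a video-level score could be obtained by averaging `frame_quality` over sampled frames; both of these choices are assumptions rather than details taken from the paper.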
