No-reference omnidirectional video quality assessment based on generative adversarial networks

Omnidirectional video quality assessment (OVQA) helps to evaluate viewers' visual experience and promotes the development of omnidirectional video. The perceived quality of omnidirectional video is affected not only by the video content and distortion, but also by the viewing directions that viewers prefer. Several quality assessment methods for omnidirectional video already exist, but most of them are full-reference (FR). Compared with FR methods, no-reference (NR) assessment is more difficult because the reference video is unavailable. In this paper, an NR OVQA method based on generative adversarial networks (GANs) is proposed, which consists of a reference video generator and a quality score predictor. In general, a reference image/video may be degraded by several distortion types, and each distortion type has several distortion levels. To the best of our knowledge, several NR methods already use GANs to generate reference images/videos for quality assessment. In these methods, the GAN generates a reference image/video from each distorted input. To achieve accurate quality assessment, the references generated from distorted images/videos that share the same distortion type but differ in distortion level are expected to have quality as similar to each other as possible, since they all originate from the same reference. However, the GAN treats these distorted inputs independently and therefore generates slightly different references for them. Existing GAN-based methods do not consider this issue. To solve it, we introduce a level loss into OVQA. For the quality score predictor, as a further contribution of this paper, the viewing direction of the omnidirectional video is incorporated to guide the quality and weight regression. A publicly available dataset is used to evaluate the proposed method. The experimental results demonstrate the effectiveness of the proposed method.
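A minimal sketch of the two ideas above, assuming a PyTorch implementation; the function names (level_loss, pooled_quality), tensor shapes, L1 pairwise penalty, and softmax weighting are illustrative assumptions, not the paper's actual formulation. It only shows how a level loss could penalize differences among references generated from the same content at different distortion levels, and how learned weights could pool local quality scores into an overall score.

```python
# Illustrative sketch only; not the authors' implementation.
import torch


def level_loss(generated_refs: torch.Tensor) -> torch.Tensor:
    """Penalize pairwise differences among references generated from the same
    content and distortion type at different distortion levels.

    generated_refs: (L, C, H, W) -- one generated reference per distortion level.
    """
    num_levels = generated_refs.size(0)
    loss = generated_refs.new_zeros(())
    pairs = 0
    for i in range(num_levels):
        for j in range(i + 1, num_levels):
            # The generator should output (nearly) the same reference no matter
            # which distortion level the input came from.
            loss = loss + torch.mean(torch.abs(generated_refs[i] - generated_refs[j]))
            pairs += 1
    return loss / max(pairs, 1)


def pooled_quality(patch_scores: torch.Tensor, patch_weights: torch.Tensor) -> torch.Tensor:
    """Combine local quality scores with learned weights (assumed here to be
    regressed with guidance from the viewing direction) into an overall score.

    patch_scores, patch_weights: (N,) -- one entry per spatial/temporal patch.
    """
    w = torch.softmax(patch_weights, dim=-1)
    return torch.sum(w * patch_scores, dim=-1)
```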
