Perceptual Evaluation of 360 Audiovisual Quality and Machine Learning Predictions

In an earlier study, we gathered perceptual evaluations of the audio, video, and audiovisual quality of 360 audiovisual content. This paper investigates the prediction of perceived audiovisual quality from objective quality metrics and subjective scores for 360 video and spatial audio content. Thirteen objective video quality metrics and three objective audio quality metrics were evaluated for five stimuli for each coding parameter. Four regression-based machine learning models were trained and tested: multiple linear regression, decision tree, random forest, and support vector machine. Each model was built from a combination of an audio and a video quality metric, and two cross-validation methods (k-Fold and Leave-One-Out) were investigated, yielding 312 predictive models (13 video metrics × 3 audio metrics × 4 regression models × 2 cross-validation methods). The results indicate that the model based on VMAF and AMBIQUAL performs better than the other combinations of audio and video quality metrics. In this study, the support vector machine provides higher performance using k-Fold cross-validation (PCC = 0.909, SROCC = 0.914, and RMSE = 0.416). These results can provide insights for the design of multimedia quality metrics and the development of predictive models for audiovisual omnidirectional media.
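The model count and the three reported performance measures can be illustrated with a minimal sketch. It is not the authors' implementation: apart from VMAF and AMBIQUAL, the metric names are placeholders, and the helper functions `pcc`, `ranks`, `srocc`, and `rmse` are hypothetical names written here using the standard definitions of Pearson correlation, Spearman rank-order correlation, and root-mean-square error.

```python
from itertools import product
from math import sqrt

# Only VMAF and AMBIQUAL are named in the abstract; the rest are placeholders.
video_metrics = ["VMAF"] + [f"video_metric_{i}" for i in range(2, 14)]     # 13 in total
audio_metrics = ["AMBIQUAL"] + [f"audio_metric_{i}" for i in range(2, 4)]  # 3 in total
regressors = ["multiple linear regression", "decision tree",
              "random forest", "support vector machine"]
cv_methods = ["k-Fold", "Leave-One-Out"]

# One predictive model per (video metric, audio metric, regressor, CV method).
model_grid = list(product(video_metrics, audio_metrics, regressors, cv_methods))
n_models = len(model_grid)  # 13 * 3 * 4 * 2


def pcc(x, y):
    """Pearson correlation coefficient (undefined for constant input)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def srocc(x, y):
    """Spearman rank-order correlation: Pearson correlation of the ranks."""
    return pcc(ranks(x), ranks(y))


def rmse(predicted, observed):
    """Root-mean-square error between predicted and observed scores."""
    squared = sum((p - o) ** 2 for p, o in zip(predicted, observed))
    return sqrt(squared / len(observed))


if __name__ == "__main__":
    # Made-up scores for illustration only, not data from the study.
    predicted = [1.2, 2.1, 2.9, 4.3, 4.8]
    observed = [1.0, 2.5, 3.0, 4.0, 5.0]
    print("models:", n_models)
    print("PCC:", pcc(predicted, observed))
    print("SROCC:", srocc(predicted, observed))
    print("RMSE:", rmse(predicted, observed))
```

In a full pipeline, each entry of `model_grid` would be fitted and scored under its cross-validation scheme, and the three measures would be computed between the predicted and the subjective audiovisual quality scores.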
