Performance measure of image and video quality assessment algorithms: subjective root-mean-square error

Abstract. Evaluating algorithms used to assess image and video quality requires performance measures. Traditional performance measures (e.g., Pearson's linear correlation coefficient, Spearman's rank-order correlation coefficient, and root mean square error) compare the quality predictions of algorithms to subjective mean opinion scores (MOS) or differential mean opinion scores (DMOS). We propose a subjective root-mean-square error (SRMSE) performance measure for evaluating the accuracy of algorithms used to assess image and video quality. The SRMSE performance measure takes the dispersion between observers into account. Another important property of the SRMSE performance measure is its measurement scale, which is calibrated to units of the number of average observers. The results of the SRMSE performance measure therefore indicate, expressed as a number of observers, the extent to which an algorithm can replace the subjective experiment. Furthermore, we present the concept of target values, which define the performance level of the ideal algorithm. We have calculated the target values for all sample sets of the CID2013, CVD2014, and LIVE multiply distorted image quality databases. The target values and a MATLAB implementation of the SRMSE performance measure are available on the project page of this study.
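
The abstract describes SRMSE as an RMSE-type accuracy measure whose scale is calibrated to units of the number of average observers. The sketch below illustrates that idea under one plausible reading: if the precision of a MOS improves roughly as the observer dispersion divided by the square root of the number of observers, then a prediction error of that size is "worth" that many average observers. The function name srmse_sketch and this mapping are illustrative assumptions for exposition only, not the authors' released MATLAB implementation or the paper's exact definition.

```python
import numpy as np

def srmse_sketch(predicted, opinion_scores):
    """Illustrative SRMSE-style accuracy measure (an assumption, not the
    authors' definition): express the RMSE between an algorithm's
    predictions and the MOS in units of 'number of average observers'.

    predicted      : 1-D array of algorithm scores, one per test item
    opinion_scores : 2-D array, rows = observers, columns = test items
    """
    opinion_scores = np.asarray(opinion_scores, dtype=float)
    predicted = np.asarray(predicted, dtype=float)

    mos = opinion_scores.mean(axis=0)            # mean opinion score per item
    sigma = opinion_scores.std(axis=0, ddof=1)   # observer dispersion per item

    # RMSE of the algorithm against the MOS (predictions are assumed to be
    # already mapped onto the MOS scale).
    rmse = np.sqrt(np.mean((predicted - mos) ** 2))

    # Average per-item observer standard deviation; the standard error of a
    # MOS computed from N observers then scales roughly as sigma_avg / sqrt(N).
    sigma_avg = sigma.mean()

    # Number of average observers whose MOS would show the same error level
    # as the algorithm: solve rmse = sigma_avg / sqrt(N) for N.
    n_equivalent = (sigma_avg / rmse) ** 2
    return rmse, n_equivalent
```

Under this sketch, a larger n_equivalent means the algorithm's predictions are as reliable as a MOS averaged over more observers, which matches the abstract's interpretation of the measure as the extent to which an algorithm can replace the subjective experiment.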
