The coefficient of determination R-squared is more informative than SMAPE, MAE, MAPE, MSE and RMSE in regression analysis evaluation

Regression analysis makes up a large part of supervised machine learning, and consists of predicting a continuous dependent target from a set of predictor variables. Binary classification and regression differ in the target range: in binary classification the target can take only two values (usually encoded as 0 and 1), while in regression the target can take any value in a continuous range. Although regression analysis has been employed in a huge number of machine learning studies, no consensus has been reached on a single, unified, standard metric to assess its results. Many studies employ the mean square error (MSE) and its rooted variant (RMSE), or the mean absolute error (MAE) and its percentage variant (MAPE). Although useful, these rates share a common drawback: since their values can range between zero and +infinity, a single value says little about the performance of the regression with respect to the distribution of the ground truth elements. In this study, we focus on two rates that generate a high score only if the majority of the elements of a ground truth group have been correctly predicted: the coefficient of determination (also known as R-squared or R2) and the symmetric mean absolute percentage error (SMAPE). After showing their mathematical properties, we report a comparison between R2 and SMAPE in several use cases and in two real medical scenarios. Our results demonstrate that the coefficient of determination (R-squared) is more informative and truthful than SMAPE, and does not have the interpretability limitations of MSE, RMSE, MAE and MAPE. We therefore suggest the use of R-squared as the standard metric to evaluate regression analyses in any scientific domain.
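To make the scale argument concrete, the following is a minimal sketch, not the code used in the study, that computes the six rates in plain Python. It assumes the common textbook definitions: R2 as one minus the ratio of the residual to the total sum of squares, and SMAPE as the mean of 2|y - ŷ| / (|y| + |ŷ|), reported as a fraction between 0 and 2. The function name `regression_metrics` and the toy data are hypothetical.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE, MAPE, SMAPE and R2 for two equal-length lists.

    Assumes no ground truth value is zero (MAPE), no pair is zero in both
    lists (SMAPE), and the ground truth is not constant (R2).
    """
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)  # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    mse = ss_res / n
    return {
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        "MAE": sum(abs(e) for e in errors) / n,
        "MAPE": sum(abs(e) / abs(t) for e, t in zip(errors, y_true)) / n,
        "SMAPE": sum(
            2 * abs(e) / (abs(t) + abs(p))
            for e, t, p in zip(errors, y_true, y_pred)
        ) / n,
        "R2": 1 - ss_res / ss_tot,
    }


# Toy data (hypothetical): the same predictions expressed in two units.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 2.5, 4.0]
small = regression_metrics(y_true, y_pred)
large = regression_metrics([10 * t for t in y_true], [10 * p for p in y_pred])

for name in small:
    print(f"{name:5s}  original: {small[name]:.4f}   x10 units: {large[name]:.4f}")
```

On this toy data, MSE, RMSE and MAE change when the target is rescaled although the quality of the fit is identical, which is why a single value of these rates is hard to interpret in isolation. R2 and the percentage-based rates stay the same.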
