Cost-sensitive performance metric for comparing multiple ordinal classifiers

The surge of interest in personalized and precision medicine in recent years has increased the use of ordinal classification in biomedical science. Accuracy, Kendall's τb, and average mean absolute error are currently the three metrics most commonly used to evaluate an ordinal classifier. Although each has its benefits, no single metric weighs the benefit of predictive accuracy against the tradeoff of misclassification cost. In addition, decisions based on pairwise comparison of these metrics are not trivial, because the metrics can yield inconsistent findings. A new cost-sensitive metric is proposed to find the optimal tradeoff between the two most critical performance measures of a classification task: accuracy and cost. The proposed method accounts for the inherent ordinal structure of the data, the total misclassification cost of a classifier, and imbalanced class distribution. The strengths of the new methodology are demonstrated through analyses of three real cancer datasets and four simulation studies. In these analyses, the new cost-sensitive metric showed better performance in identifying the best ordinal classifier for a given analysis. The performance metric devised in this study provides a comprehensive tool for comparative analysis of multiple, competing ordinal classifiers. Considering the tradeoff between accuracy and misclassification cost is imperative when ordinal classification is applied to real-world decisions. The work presented here is a precursor to incorporating the proposed metric into a prediction modeling algorithm for ordinal data, as a means of integrating misclassification cost into final model selection.
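For orientation, the three existing metrics named above can be sketched in a few lines of Python. This is a minimal illustration and not the proposed cost-sensitive metric, whose formula is not given here. It assumes class labels are coded as ordered integers, and it takes "average mean absolute error" to mean the unweighted mean of the per-class mean absolute errors, which is an assumption about the intended definition. The function names are hypothetical.

```python
from itertools import combinations
from math import sqrt


def accuracy(y_true, y_pred):
    """Fraction of observations whose predicted class equals the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def average_mae(y_true, y_pred):
    """Unweighted mean of per-class mean absolute errors (assumed definition).

    Each true class contributes equally, regardless of its frequency.
    """
    per_class = []
    for c in sorted(set(y_true)):
        errs = [abs(t - p) for t, p in zip(y_true, y_pred) if t == c]
        per_class.append(sum(errs) / len(errs))
    return sum(per_class) / len(per_class)


def kendall_tau_b(y_true, y_pred):
    """Kendall's tau-b by direct O(n^2) pair counting, with tie correction."""
    conc = disc = tied_true = tied_pred = 0
    for (t1, p1), (t2, p2) in combinations(zip(y_true, y_pred), 2):
        dt, dp = t1 - t2, p1 - p2
        if dt == 0 and dp == 0:
            continue  # tied on both variables: counted in neither factor
        elif dt == 0:
            tied_true += 1  # tied on the true label only
        elif dp == 0:
            tied_pred += 1  # tied on the predicted label only
        elif dt * dp > 0:
            conc += 1  # concordant pair
        else:
            disc += 1  # discordant pair
    denom = sqrt((conc + disc + tied_true) * (conc + disc + tied_pred))
    return (conc - disc) / denom if denom else float("nan")


# Hypothetical toy data: three ordered classes, two observations each.
y_true = [1, 1, 2, 2, 3, 3]
y_pred = [1, 2, 2, 2, 3, 1]

print(accuracy(y_true, y_pred))
print(average_mae(y_true, y_pred))
print(kendall_tau_b(y_true, y_pred))
```

On the toy data, the one prediction that is two classes away from the truth costs no more in accuracy than the prediction that is one class away, while average mean absolute error penalizes it twice as heavily. None of the three functions takes a misclassification cost as input.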
