Comparing Methods for Multi-class Probabilities in Medical Decision Making Using LS-SVMs and Kernel Logistic Regression

In this paper we compare thirteen methods for obtaining multi-class probability estimates on two medical case studies. All methods are implemented with least squares support vector machine (LS-SVM) classifiers as the basic classification method. Results indicate that multi-class kernel logistic regression performs very well, as does a method based on ensembles of nested dichotomies. Among the methods that combine binary probabilities into multi-class probabilities, a Bayesian LS-SVM method imposing sparseness also performed very well.
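To make "combining binary probabilities into multi-class probabilities" concrete, here is a minimal sketch of one simple closed-form coupling rule from the pairwise-coupling literature (attributed to Price et al.). It is illustrative only and is not claimed to be one of the thirteen methods exactly as the paper implements them. The function name `couple_pairwise` and the matrix layout `r[i, j] ≈ P(class i | class i or j, x)` are assumptions made for this example; in practice the pairwise estimates would come from binary classifiers such as LS-SVMs with calibrated outputs.

```python
import numpy as np


def couple_pairwise(r):
    """Combine pairwise probabilities into multi-class probabilities.

    r is a (k, k) array where r[i, j] estimates P(class i | class i or j, x)
    for i != j (hypothetical layout; the diagonal is ignored).
    Uses the closed-form rule p_i = 1 / (sum_{j != i} 1 / r[i, j] - (k - 2)),
    followed by normalization so the result sums to one.
    """
    r = np.asarray(r, dtype=float)
    k = r.shape[0]
    eps = 1e-12  # guard against division by zero for degenerate estimates
    off_diag = ~np.eye(k, dtype=bool)
    inv = np.where(off_diag, 1.0 / np.clip(r, eps, 1.0), 0.0)
    denom = inv.sum(axis=1) - (k - 2)
    p = 1.0 / denom
    return p / p.sum()


# Usage: pairwise estimates that are consistent with known class probabilities
true_p = np.array([0.5, 0.3, 0.2])
r = true_p[:, None] / (true_p[:, None] + true_p[None, :])
estimated_p = couple_pairwise(r)
```

When the pairwise estimates are mutually consistent, as in the usage example, this rule recovers the underlying class probabilities; with noisy binary estimates the final normalization step is what guarantees a valid probability vector.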
