Doubly Optimized Calibrated Support Vector Machine (DOC-SVM): An Algorithm for Joint Optimization of Discrimination and Calibration

Historically, probabilistic models for decision support have focused on discrimination, e.g., minimizing the ranking error of predicted outcomes. Unfortunately, these models ignore another important aspect, calibration, which indicates how closely the predicted probabilities match the observed outcome frequencies. Considering discrimination and calibration simultaneously can be helpful for many clinical decisions. We investigated tradeoffs between these goals and developed a unified maximum-margin method to handle them jointly. Our approach, called Doubly Optimized Calibrated Support Vector Machine (DOC-SVM), concurrently optimizes two loss functions: the ridge regression loss and the hinge loss. Experiments using three breast cancer gene-expression datasets (GSE2034, GSE2990, and Chanrion's dataset) showed that our model generated better calibrated outputs than other state-of-the-art models such as Support Vector Machine (p = 0.03, p = 0.13, and p < 0.001) and Logistic Regression (p = 0.006, p = 0.008, and p < 0.001). DOC-SVM also demonstrated better discrimination (i.e., higher AUCs) than Support Vector Machine (p = 0.38, p = 0.29, and p = 0.047) and Logistic Regression (p = 0.38, p = 0.04, and p < 0.0001). DOC-SVM produced a model that was better calibrated without sacrificing discrimination, and hence may be helpful in clinical decision making.
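
The idea of optimizing a hinge loss and a ridge regression (squared error plus L2 penalty) loss together can be illustrated with a minimal sketch. The abstract does not give the exact objective, so everything specific here is an assumption made for illustration: the linear model, the weighting parameter `alpha`, the penalty `lam`, the function names `combined_loss` and `fit_joint`, and the use of plain subgradient descent in place of whatever solver the authors used. It is not the DOC-SVM implementation.

```python
import numpy as np

def combined_loss(w, b, X, s, alpha, lam):
    """Mean hinge loss + alpha * mean squared error + L2 penalty on w.

    s holds labels coded as -1 / +1. The weighting scheme (alpha, lam)
    is a hypothetical choice, not taken from the paper.
    """
    f = X @ w + b                                # linear scores
    hinge = np.maximum(0.0, 1.0 - s * f).mean()  # margin-based (discrimination) term
    squared = ((f - s) ** 2).mean()              # squared-error (calibration-like) term
    return hinge + alpha * squared + 0.5 * lam * (w @ w)

def fit_joint(X, s, alpha=1.0, lam=0.01, lr=0.05, epochs=500):
    """Minimize combined_loss by full-batch subgradient descent."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(epochs):
        f = X @ w + b
        active = (s * f < 1.0).astype(float)     # points inside or beyond the margin
        # Subgradient of the data terms with respect to each score f_i.
        g_f = (-active * s + 2.0 * alpha * (f - s)) / n
        w -= lr * (X.T @ g_f + lam * w)
        b -= lr * g_f.sum()
    return w, b

# Synthetic two-class data (illustrative only, not the gene-expression datasets).
rng = np.random.default_rng(0)
n_half = 100
X = np.vstack([rng.normal(1.5, 1.0, (n_half, 2)),
               rng.normal(-1.5, 1.0, (n_half, 2))])
s = np.concatenate([np.ones(n_half), -np.ones(n_half)])
w, b = fit_joint(X, s)
```

In this sketch, setting `alpha` to 0 leaves an ordinary L2-regularized hinge objective, while a large `alpha` makes the fit behave more like ridge regression on the ±1 labels; intermediate values trade off the two terms.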
