Logistic push: a regression framework for partial AUC optimization

The area under the receiver operating characteristic curve (AUC) is often used to evaluate the performance of clinical prediction models. A more refined strategy, proposed recently, examines a partial area under the curve (pAUC), which can account for the differing costs of false negative and false positive results. Depending on the clinical question, this can substantially increase the clinical utility of prediction models. However, properties of the pAUC estimator create significant challenges for pAUC-optimal marker selection and model building, and current approaches to these tasks can be complex and computationally intensive. We present a simpler method based on weighted logistic regressions. We refer to our strategy as logistic push because it shares heuristics with the ranking algorithm P-norm push. Logistic push is particularly useful in the high-dimensional setting, where fast and broadly available algorithms for fitting penalized regressions can be used for both marker selection and model fitting.
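
To make the target quantity concrete, the following is a minimal sketch of an empirical pAUC over a restricted false positive range. It is not the logistic push method itself, only an illustration of what is being optimized. The function name `partial_auc`, the argument `max_fpr`, and the choice of the range [0, `max_fpr`] are assumptions made for illustration, and ties between case and control scores are counted as non-concordant.

```python
def partial_auc(case_scores, control_scores, max_fpr):
    """Empirical (unnormalized) pAUC over false positive rates in [0, max_fpr].

    Hypothetical helper for illustration: counts case/control pairs in which
    the case outscores the control, restricted to the highest-scoring
    controls, i.e. those that would be false positives at FPR <= max_fpr.
    """
    if not 0 < max_fpr <= 1:
        raise ValueError("max_fpr must lie in (0, 1]")
    n_cases = len(case_scores)
    n_controls = len(control_scores)
    if n_cases == 0 or n_controls == 0:
        raise ValueError("need at least one case and one control")

    # Number of top-ranked controls admitted by the FPR restriction.
    k = int(max_fpr * n_controls)
    top_controls = sorted(control_scores, reverse=True)[:k]

    # Concordant pairs: the case scores strictly higher than the control.
    concordant = sum(1 for c in case_scores for h in top_controls if c > h)
    return concordant / (n_cases * n_controls)


cases = [0.9, 0.8, 0.4]
controls = [0.7, 0.5, 0.3, 0.1]
pauc_half = partial_auc(cases, controls, max_fpr=0.5)  # area for FPR in [0, 0.5]
auc_full = partial_auc(cases, controls, max_fpr=1.0)   # reduces to the usual AUC
```

With `max_fpr=1.0` the restriction is inactive and the function returns the ordinary empirical AUC; smaller values keep only comparisons against the highest-scoring controls, which is why a model can rank well on AUC yet poorly on pAUC.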
