Reliable and computationally efficient maximum-likelihood estimation of "proper" binormal ROC curves.

RATIONALE AND OBJECTIVES: Estimation of ROC curves and their associated indices from experimental data can be problematic, especially in multireader, multicase (MRMC) observer studies. Wilcoxon estimates of the area under the curve (AUC) can be strongly biased with categorical data, whereas the conventional binormal ROC curve-fitting model may produce unrealistic fits. The "proper" binormal model (PBM) was introduced by Metz and Pan to provide acceptable fits for both sturdy and problematic datasets, but other investigators found that its first software implementation was numerically unstable in some situations. Therefore, we created an entirely new algorithm to implement the PBM.

MATERIALS AND METHODS: This paper describes in detail the new PBM curve-fitting algorithm, which was designed to perform successfully in all problematic situations encountered previously. Extensive testing was also conducted on a broad variety of simulated and real datasets. Windows, Linux, and Apple Macintosh OS X versions of the algorithm are available online at http://xray.bsd.uchicago.edu/krl/.

RESULTS: Plots of fitted curves as well as summaries of AUC estimates and their standard errors are reported. The new algorithm never failed to converge and produced good fits for all of the several million datasets on which it was tested. For all but the most problematic datasets, the algorithm also produced very good estimates of the AUC standard error. The AUC estimates compared well with Wilcoxon estimates for continuously distributed data and are expected to be superior for categorical data.

CONCLUSION: This implementation of the PBM is reliable in a wide variety of ROC curve-fitting tasks.
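The abstract contrasts the nonparametric Wilcoxon estimate of AUC with parametric binormal fits. As a rough illustration only, the sketch below computes the Wilcoxon (Mann-Whitney) AUC from rating data and evaluates the standard closed-form AUC of a conventional binormal curve, Phi(a / sqrt(1 + b^2)). It is not the PBM fitting algorithm described in the paper; the function names and the five-category ratings are hypothetical and chosen purely for illustration.

```python
from math import erf, sqrt


def wilcoxon_auc(negatives, positives):
    """Wilcoxon (Mann-Whitney) AUC estimate.

    Counts the fraction of (positive, negative) pairs in which the
    positive case has the higher rating, with ties counted as one half.
    """
    score = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                score += 1.0
            elif p == n:
                score += 0.5
    return score / (len(positives) * len(negatives))


def binormal_auc(a, b):
    """AUC of a conventional binormal ROC curve with parameters (a, b).

    Uses the closed form Phi(a / sqrt(1 + b^2)), where Phi is the
    standard normal cumulative distribution function.
    """
    z = a / sqrt(1.0 + b * b)
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


# Hypothetical five-category confidence ratings (illustrative only).
negative_ratings = [1, 1, 2, 2, 3]
positive_ratings = [2, 3, 4, 4, 5]

print(wilcoxon_auc(negative_ratings, positive_ratings))  # 0.9
print(binormal_auc(1.0, 1.0))
```

With only a few rating categories, many positive and negative cases tie, and each tie contributes one half to the Wilcoxon count; this is the situation in which the abstract notes that Wilcoxon estimates can be strongly biased.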
