Continuous versus categorical data for ROC analysis: some quantitative considerations.

RATIONALE AND OBJECTIVES: Several authors have encouraged the use of a quasi-continuous rating scale for data collection in receiver operating characteristic (ROC) curve analysis of diagnostic modalities, rather than rating scales based on five to seven ordinal categories or levels of suspicion. Although many investigators have adopted this method, debate over the issues continues. The present work provides a quantitative analysis from the viewpoint of measurement science.

MATERIALS AND METHODS: A simple model was developed of the effect of data discretization, or quantization, on the measured variance of noisy data. Monte Carlo simulations of multiple-reader, multiple-case ROC experiments were then performed and analyzed in terms of components-of-variance models to investigate the effect of data quantization in that more complex setting.

RESULTS: For single-reader studies, discretization into five categories can substantially reduce the precision of ROC measurements. The effect may be attenuated in multireader studies.

CONCLUSION: Good measurement methods yield more precise measurements of diagnostic detection performance and thus more efficient use of resources. The use of a quasi-continuous rating scale in ROC studies promotes such methods.
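The single-reader comparison described above can be illustrated with a small Monte Carlo sketch. This is not the authors' model or code. It assumes an equal-variance binormal reader, an empirical (Mann-Whitney) area under the ROC curve as the performance index, and four arbitrary cut points that define five rating categories. The function names, sample sizes, separation `mu`, and thresholds are all hypothetical choices made for illustration.

```python
import math
import numpy as np


def empirical_auc(neg, pos):
    """Mann-Whitney estimate of the area under the ROC curve (ties count 1/2)."""
    diff = pos[:, None] - neg[None, :]
    return float(np.mean(diff > 0) + 0.5 * np.mean(diff == 0))


def simulate(n_trials=2000, n_neg=50, n_pos=50, mu=1.5,
             thresholds=(0.0, 0.5, 1.0, 1.5), seed=0):
    """Repeat a single-reader experiment and return AUC estimates obtained
    from continuous ratings and from the same ratings cut into categories.

    All defaults are illustrative assumptions, not values from the paper.
    """
    rng = np.random.default_rng(seed)
    cuts = np.asarray(thresholds)
    auc_cont = np.empty(n_trials)
    auc_quant = np.empty(n_trials)
    for t in range(n_trials):
        neg = rng.normal(0.0, 1.0, n_neg)  # ratings for actually-negative cases
        pos = rng.normal(mu, 1.0, n_pos)   # ratings for actually-positive cases
        auc_cont[t] = empirical_auc(neg, pos)
        # np.digitize maps each rating to a category index 0..len(cuts)
        auc_quant[t] = empirical_auc(np.digitize(neg, cuts),
                                     np.digitize(pos, cuts))
    return auc_cont, auc_quant


def binormal_auc(mu):
    """Population AUC for the equal-variance binormal model: Phi(mu / sqrt(2))."""
    return 0.5 * (1.0 + math.erf(mu / 2.0))


if __name__ == "__main__":
    cont, quant = simulate()
    print("population AUC      :", round(binormal_auc(1.5), 4))
    print("continuous  mean, sd:", cont.mean(), cont.std(ddof=1))
    print("5-category  mean, sd:", quant.mean(), quant.std(ddof=1))
```

Comparing the two standard deviations, and the two means against the population value, gives a rough sense of how a five-category scale changes the spread and bias of the estimate under these assumptions. A multireader study would need additional reader and case variance components, which this sketch does not model.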
