A permutation test sensitive to differences in areas for comparing ROC curves from a paired design

The area under the receiver operating characteristic (ROC) curve (AUC) is a widely accepted summary index of the overall performance of diagnostic procedures and the difference between AUCs is often used when comparing two diagnostic systems. We developed an exact non-parametric statistical procedure for comparing two ROC curves in paired design settings. The test which is based on all permutations of the subject specific rank ratings is formally a test for equality of ROC curves that is sensitive to the alternatives of AUC difference. The operating characteristics of the proposed test were evaluated using extensive simulations over a wide range of parameters. The proposed procedure can be easily implemented in experimental ROC data sets. For small samples and for underlying parameters that are common in experimental studies in diagnostic imaging the test possesses good operating characteristics and is more powerful than the conventional non-parametric procedure for AUC comparisons. We also derived an asymptotic version of the test which uses an exact estimate of the variance in the permutation space and provides a good approximation even when the sample sizes are small. This asymptotic procedure is a simple and precise approximation to the exact test and is useful for large sample sizes where the exact test may be computationally burdensome.

[1]  C. Metz,et al.  A New Approach for Testing the Significance of Differences Between ROC Curves Measured from Correlated Data , 1984 .

[2]  J. Hanley,et al.  A method of comparing the areas under receiver operating characteristic curves derived from the same cases. , 1983, Radiology.

[3]  John A. Swets,et al.  Evaluation of diagnostic systems : methods from signal detection theory , 1982 .

[4]  C. Metz ROC Methodology in Radiologic Imaging , 1986, Investigative radiology.

[5]  C. Metz Basic principles of ROC analysis. , 1978, Seminars in nuclear medicine.

[6]  S. Greenhouse,et al.  The evaluation of diagnostic tests. , 1950, Biometrics.

[7]  D. Bamber The area above the ordinal dominance graph and the area below the receiver operating characteristic graph , 1975 .

[8]  J. N. Arvesen Jackknifing U-statistics , 1968 .

[9]  H E Rockette,et al.  Empiric assessment of parameters that affect the design of multireader receiver operating characteristic studies. , 1999, Academic radiology.

[10]  J. Hilden The Area under the ROC Curve and Its Competitors , 1991, Medical decision making : an international journal of the Society for Medical Decision Making.

[11]  E. DeLong,et al.  Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. , 1988, Biometrics.

[12]  E. Venkatraman,et al.  A Permutation Test to Compare Receiver Operating Characteristic Curves , 2000, Biometrics.

[13]  R. F. Wagner,et al.  Assessment of medical imaging and computer-assist systems: lessons from recent experience. , 2002, Academic radiology.

[14]  Chuhsing Kate Hsiao,et al.  Alternative Summary Indices for the Receiver Operating Characteristic Curve , 1996, Epidemiology.

[15]  E. S. Venkatraman,et al.  A distribution-free procedure for comparing receiver operating characteristic curves from a paired experiment , 1996 .

[16]  D. Dorfman,et al.  Maximum-likelihood estimation of parameters of signal-detection theory and determination of confidence intervals—Rating-method data , 1969 .

[17]  G. Campbell,et al.  Advances in statistical methodology for the evaluation of diagnostic and laboratory tests. , 1994, Statistics in medicine.

[18]  J. Hanley,et al.  The meaning and use of the area under a receiver operating characteristic (ROC) curve. , 1982, Radiology.

[19]  W. Hoeffding A Class of Statistics with Asymptotically Normal Distribution , 1948 .

[20]  Xiao-Hua Zhou,et al.  Statistical Methods in Diagnostic Medicine , 2002 .

[21]  J. Hanley The Robustness of the "Binormal" Assumptions Used in Fitting ROC Curves , 1988, Medical decision making : an international journal of the Society for Medical Decision Making.