Hypothesis testing of diagnostic accuracy for multiple readers and multiple tests: An ANOVA approach with dependent observations

A method is proposed for comparing the accuracy of diagnostic tests that rely on a reader's subjective interpretation of the results. An ANOVA approach is applied in which the dependencies between the readers' estimates of diagnostic accuracy are handled by adjusting the usual F statistic for the estimated correlation. The distribution of the resulting test statistic is evaluated. The problem is particularly relevant to diagnostic radiology, where multi-reader studies of the indices associated with the ROC curve play an important role in evaluating the efficacy of diagnostic tests.
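The abstract does not give the form of the adjustment, so the following is only a minimal sketch of one way such a correction can be written for a tests-by-readers table of accuracy estimates. It assumes the denominator of the usual F ratio is inflated by the number of readers times the difference between two covariances. The function name `corrected_f`, the inputs `cov2` and `cov3`, and the example numbers are all illustrative assumptions, not values or notation taken from the paper.

```python
import numpy as np


def corrected_f(acc, cov2, cov3):
    """Return (corrected F, usual F) for a tests-by-readers accuracy table.

    acc  : array of shape (n_tests, n_readers), e.g. ROC-area estimates.
    cov2 : ASSUMED covariance between estimates from different readers
           using the same test.
    cov3 : ASSUMED covariance between estimates from different readers
           using different tests.
    """
    acc = np.asarray(acc, dtype=float)
    n_tests, n_readers = acc.shape

    grand_mean = acc.mean()
    test_means = acc.mean(axis=1)
    reader_means = acc.mean(axis=0)

    # Mean square for the test effect.
    ms_test = n_readers * np.sum((test_means - grand_mean) ** 2) / (n_tests - 1)

    # Mean square for the test-by-reader interaction.
    resid = acc - test_means[:, None] - reader_means[None, :] + grand_mean
    ms_inter = np.sum(resid ** 2) / ((n_tests - 1) * (n_readers - 1))

    # Assumed correction: inflate the denominator for correlated estimates,
    # truncated at zero so the corrected F never exceeds the usual F.
    correction = max(n_readers * (cov2 - cov3), 0.0)

    return ms_test / (ms_inter + correction), ms_test / ms_inter


# Hypothetical accuracy estimates: 2 tests (rows) by 3 readers (columns).
example = [[0.80, 0.85, 0.90],
           [0.85, 0.88, 0.96]]
f_corrected, f_usual = corrected_f(example, cov2=0.0004, cov3=0.0002)
```

In this sketch the corrected statistic reduces to the usual F ratio when the two covariances are equal, and shrinks as the same-test covariance grows relative to the different-test covariance.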
