Nonparametric Estimation of ROC Curves in the Absence of a Gold Standard

Evaluating the diagnostic accuracy of a test requires a gold standard for disease status. For many complex diseases, however, obtaining such a gold standard is impossible or unethical. If an imperfect standard is used instead, the estimated accuracy of the tests is biased; this is called imperfect gold standard bias. In this article we develop a nonparametric maximum likelihood method for estimating the ROC curves of ordinal-scale tests, and the areas under them, in the absence of a gold standard. Our simulation study shows that the proposed estimators of the ROC curve areas have good finite-sample properties in terms of bias and mean squared error. Further simulation studies show that our nonparametric approach performs comparably to the binormal parametric method and is easier to implement. Finally, we illustrate the proposed method with a real clinical study assessing the accuracy of seven specific pathologists in detecting carcinoma in situ of the uterine cervix.
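To make the target quantities concrete, the sketch below computes the empirical ROC points and the trapezoidal ROC area of a single ordinal-scale test from its category probabilities in the diseased and non-diseased groups. It is only an illustration of the plug-in step, not the estimator developed in the article: it assumes those two probability vectors are already available (for example, from some latent-class fit that does not rely on a gold standard), and the names `roc_points`, `roc_area`, `p_dis`, and `p_non` are hypothetical.

```python
def roc_points(p_dis, p_non):
    """ROC points (FPR, TPR) for the rules 'positive if category >= k'.

    p_dis[k] and p_non[k] are assumed to be the probabilities of rating
    category k among diseased and non-diseased subjects, with categories
    ordered from least to most suspicious.
    """
    if len(p_dis) != len(p_non):
        raise ValueError("both groups must use the same rating categories")
    points = [(0.0, 0.0)]
    tpr = fpr = 0.0
    # Lower the threshold one category at a time, starting from the top.
    for k in range(len(p_dis) - 1, -1, -1):
        tpr += p_dis[k]
        fpr += p_non[k]
        points.append((fpr, tpr))
    return points


def roc_area(p_dis, p_non):
    """Trapezoidal area under the empirical ROC curve.

    This equals P(T_dis > T_non) + 0.5 * P(T_dis = T_non).
    """
    pts = roc_points(p_dis, p_non)
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(pts, pts[1:]))


if __name__ == "__main__":
    # Made-up probabilities for a three-category test, for illustration only.
    p_dis = [0.1, 0.2, 0.7]
    p_non = [0.6, 0.3, 0.1]
    print(roc_points(p_dis, p_non))
    print(roc_area(p_dis, p_non))
```

With a gold standard, the two vectors would simply be observed category proportions; without one, they have to be estimated jointly with the disease prevalence, which is the hard part this sketch leaves out.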
