A crossed random effects modeling approach for estimating diagnostic accuracy from ordinal ratings without a gold standard

In diagnostic studies without a gold standard, the assumed dependence structure among the multiple tests or raters plays an important role in model performance. For a binary disease status, both conditional independence and crossed random effects structures have been proposed and their performance investigated. Less attention has been paid to the situation where the true disease status is ordinal. In this paper, we propose crossed subject-specific and rater-specific random effects to account for the dependence structure, and we assess the robustness of the proposed model to misspecification of the random effects distributions. We apply the models to data from the Physician Reliability Study, which assesses diagnostic accuracy in a population of raters for the staging of endometriosis, a gynecological disorder. Using this new methodology, we estimate the probability of a correct classification and show that regional experts can classify the intermediate stage more easily than resident physicians can.
