Effect of dependent errors in the assessment of diagnostic or screening test accuracy when the reference standard is imperfect

When no gold standard is available to evaluate a diagnostic or screening test, as is often the case, an imperfect reference standard test must be used instead. Furthermore, the errors of the test and its reference standard may not be independent. Some authors have opined that positively dependent errors will lead to overestimation of test performance. Although positive dependence does increase agreement between the test and the reference standard, it is not clear if test accuracy will necessarily be overestimated in this situation, and the case of negatively associated test errors is even less clear. To examine this issue in more detail, we derive the apparent sensitivity, specificity, and overall accuracy of a test relative to an imperfect reference standard and the bias in these parameters. We demonstrate that either positive or negative bias can occur if the reference standard is imperfect. The type and magnitude of bias depend on several components: the disease prevalence, the true test sensitivity and specificity, the covariance between the false‐negative test errors among the true disease cases, and the covariance between the false‐positive test errors among the true noncases. If, for example, sensitivity and specificity are 0.8 for both the test and reference standard and the errors have a moderate positive dependence, test sensitivity is then underestimated at low prevalence but overestimated at high prevalence, while the opposite occurs for specificity. We illustrate these ideas through general numerical calculations and an empirical example of screening for breast cancer with magnetic resonance imaging and mammography. Copyright © 2012 John Wiley & Sons, Ltd.
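The prevalence dependence described above can be reproduced with a short calculation. The sketch below is not the authors' code. It assumes the usual covariance parameterization of conditional dependence, in which the joint probability that both tests are correct (or both wrong) within a disease class equals the product of the marginal probabilities plus a covariance term. The function name `apparent_accuracy`, its parameter names, and the covariance value 0.05 used to stand in for "moderate positive dependence" are illustrative assumptions, not values taken from the paper.

```python
def apparent_accuracy(prev, se_t, sp_t, se_r, sp_r, cov_d=0.0, cov_n=0.0):
    """Apparent sensitivity and specificity of a test scored against an
    imperfect reference standard (hypothetical helper, for illustration).

    prev        true disease prevalence
    se_t, sp_t  true sensitivity and specificity of the test
    se_r, sp_r  true sensitivity and specificity of the reference standard
    cov_d       covariance of the two tests' errors among true cases
    cov_n       covariance of the two tests' errors among true noncases
    """
    # Joint outcome probabilities among true cases.
    both_pos_d = se_t * se_r + cov_d
    both_neg_d = (1 - se_t) * (1 - se_r) + cov_d
    # Joint outcome probabilities among true noncases.
    both_pos_n = (1 - sp_t) * (1 - sp_r) + cov_n
    both_neg_n = sp_t * sp_r + cov_n

    # Marginal probability of a positive / negative reference result.
    ref_pos = prev * se_r + (1 - prev) * (1 - sp_r)
    ref_neg = 1 - ref_pos

    # Apparent sensitivity: P(test positive | reference positive).
    app_se = (prev * both_pos_d + (1 - prev) * both_pos_n) / ref_pos
    # Apparent specificity: P(test negative | reference negative).
    app_sp = (prev * both_neg_d + (1 - prev) * both_neg_n) / ref_neg
    return app_se, app_sp


if __name__ == "__main__":
    # Assumed scenario: all four accuracy parameters equal 0.8 and the
    # error covariance is 0.05 in both disease classes.
    for prev in (0.05, 0.50, 0.95):
        app_se, app_sp = apparent_accuracy(prev, 0.8, 0.8, 0.8, 0.8, 0.05, 0.05)
        print(f"prevalence={prev:.2f}  "
              f"apparent Se={app_se:.3f} (bias {app_se - 0.8:+.3f})  "
              f"apparent Sp={app_sp:.3f} (bias {app_sp - 0.8:+.3f})")
```

Under these assumed values, the apparent sensitivity is about 0.52 at a prevalence of 0.05 and about 0.86 at a prevalence of 0.95, with the mirror-image pattern for specificity, which matches the direction of bias stated in the abstract. Setting `se_r` and `sp_r` to 1 (a perfect reference, which forces both covariances to 0) returns the true sensitivity and specificity.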
