Concerns about composite reference standards in diagnostic research

Composite reference standards are used to evaluate the accuracy of a new test when no perfect reference test is available. A composite reference standard applies a fixed, transparent rule to classify subjects as disease positive or disease negative based on existing imperfect tests. The accuracy of the composite reference standard itself has received limited attention. We show that increasing the number of tests used to define a composite reference standard can worsen its accuracy, leading to underestimation or overestimation of the new test’s accuracy. Further, estimates based on composite reference standards vary with disease prevalence, indicating that they may not be comparable across studies. These problems arise because composite reference standards make a simplistic classification and then ignore the uncertainty in that classification. Latent class models that adjust for the accuracy of the different imperfect tests and the dependence between them should be pursued to make better use of the data.
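
The two effects described above can be illustrated with a small numerical sketch. It is not the authors' analysis: it assumes an "any positive" (OR) rule for the composite, conditional independence between all tests given true disease status, and illustrative accuracy values chosen here purely for demonstration. The function names `or_rule_accuracy` and `apparent_accuracy` are hypothetical.

```python
from math import prod


def or_rule_accuracy(sens, spec):
    """Sensitivity and specificity of an 'any positive' composite.

    Assumes the component tests are conditionally independent given
    true disease status (an assumption made for this sketch).
    """
    comp_sens = 1 - prod(1 - s for s in sens)  # missed only if every test misses
    comp_spec = prod(spec)                     # negative only if every test is negative
    return comp_sens, comp_spec


def apparent_accuracy(new_sens, new_spec, comp_sens, comp_spec, prev):
    """Apparent accuracy of a new test judged against the composite.

    Assumes the new test is conditionally independent of the composite
    given true disease status; prev is the true disease prevalence.
    """
    # Probability that the composite classifies a subject as positive.
    comp_pos = prev * comp_sens + (1 - prev) * (1 - comp_spec)
    comp_neg = 1 - comp_pos
    # New test positive AND composite positive (true cases + false alarms).
    both_pos = (prev * new_sens * comp_sens
                + (1 - prev) * (1 - new_spec) * (1 - comp_spec))
    # New test negative AND composite negative.
    both_neg = (prev * (1 - new_sens) * (1 - comp_sens)
                + (1 - prev) * new_spec * comp_spec)
    return both_pos / comp_pos, both_neg / comp_neg


if __name__ == "__main__":
    # Illustrative values only: each component test has sensitivity 0.60
    # and specificity 0.95; the new test has 0.90 and 0.98.
    for k in (1, 2, 3, 4):
        cs, cc = or_rule_accuracy([0.60] * k, [0.95] * k)
        for prev in (0.10, 0.50):
            a_sens, a_spec = apparent_accuracy(0.90, 0.98, cs, cc, prev)
            print(f"k={k} prev={prev:.2f} composite=({cs:.3f}, {cc:.3f}) "
                  f"apparent=({a_sens:.3f}, {a_spec:.3f})")
```

Under these assumptions, each added component test raises the composite's sensitivity but lowers its specificity, and the apparent accuracy of the new test changes with prevalence even though its true accuracy is held fixed. Other composite rules, or dependence between tests, would give different numbers.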
