Gold standards are out and Bayes is in: Implementing the cure for imperfect reference tests in diagnostic accuracy studies.

Bayesian mixture models, often termed latent class models, allow users to estimate the diagnostic accuracy of tests and the true prevalence in one or more populations when the positive and/or negative reference standards are imperfect. Moreover, they allow the data analyst to show the superiority of a novel test over an old test, even if the old test is the (imperfect) reference standard. We use published data on toxoplasmosis in pigs to explore how the number of tests, the number of populations, and the dependence structure among tests affect (local) model identifiability. We discuss and make recommendations about the use of priors, sensitivity analysis, model identifiability, and study design options, and we strongly argue for the use of Bayesian mixture models as a logical and coherent approach for estimating the diagnostic accuracy of two or more tests.
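
To illustrate why the numbers of tests and populations matter for identifiability, the following is a minimal sketch of the standard parameter-counting check. It assumes conditional independence among tests and test accuracies that are constant across populations; the function name `counting_check` and its return keys are hypothetical and chosen only for this illustration. Passing the check is a necessary condition, not a sufficient one, and dependence terms among tests would add parameters that are not counted here.

```python
def counting_check(n_tests: int, n_populations: int) -> dict:
    """Compare free parameters with degrees of freedom in the data.

    Assumptions (for illustration only): tests are conditionally
    independent given true status, and each test's sensitivity and
    specificity are the same in every population.
    """
    # Each population yields a multinomial over 2**n_tests result
    # patterns, so it contributes 2**n_tests - 1 free cell probabilities.
    degrees_of_freedom = n_populations * (2 ** n_tests - 1)

    # One sensitivity and one specificity per test, plus one prevalence
    # per population.
    n_parameters = 2 * n_tests + n_populations

    return {
        "degrees_of_freedom": degrees_of_freedom,
        "n_parameters": n_parameters,
        # Necessary (not sufficient) condition for identifiability.
        "passes_counting_check": n_parameters <= degrees_of_freedom,
    }


if __name__ == "__main__":
    for tests, pops in [(2, 1), (2, 2), (3, 1)]:
        print(tests, pops, counting_check(tests, pops))
```

Under these assumptions, two tests in one population fail the check (five parameters against three degrees of freedom), whereas two tests in two populations and three tests in one population each meet it exactly. When the check fails, or when dependence parameters are added, informative priors on some parameters are needed to obtain useful inferences.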
