On implementation of the Gibbs sampler for estimating the accuracy of multiple diagnostic tests

The implementation of the Gibbs sampler for estimating the accuracy of multiple binary diagnostic tests in a single population is investigated. This method, proposed by Joseph, Gyorkos and Coupal, takes a Bayesian approach and is used in the absence of a gold standard to estimate the disease prevalence and the sensitivity and specificity of medical diagnostic tests. The expressions that allow the method to be implemented for an arbitrary number of tests are given. Using the convergence diagnostic procedure of Raftery and Lewis, the relation between the number of Gibbs sampling iterations and the precision of the estimated quantiles of the posterior distributions is derived. An example is presented based on a data set of gastro-esophageal reflux disease patients, collected to evaluate the accuracy of the water siphon test compared with 24 h pH-monitoring, endoscopy and histology. The main message that emerges from our analysis is that implementing the Gibbs sampler to estimate the parameters of multiple binary diagnostic tests can be problematic, and convergence diagnostics are advised whenever this method is used. The factors that affect the convergence of the chains to the posterior distributions, and those that influence the precision of their quantiles, are analyzed.
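To make the setting concrete, the following is a minimal Python sketch of a Gibbs sampler for K binary tests applied to one population with no gold standard. It is not the paper's implementation. It assumes that the tests are conditionally independent given the true disease status and that all parameters have uniform Beta(1, 1) priors; the function names, the starting values and the burn-in handling are illustrative choices. The second function evaluates only the independent-sample lower bound on the number of iterations from the Raftery and Lewis procedure, which the full diagnostic then inflates to account for autocorrelation in the chain.

```python
import math
from statistics import NormalDist

import numpy as np


def gibbs_latent_class(tests, n_iter=2000, burn_in=500, seed=0):
    """Gibbs sampler sketch for K conditionally independent binary tests.

    tests: (N, K) array of 0/1 results, one row per subject.
    Priors are Beta(1, 1) throughout (an assumption of this sketch).
    Returns a dict of post-burn-in draws for prevalence, sensitivity
    and specificity.
    """
    rng = np.random.default_rng(seed)
    tests = np.asarray(tests, dtype=int)
    n, k = tests.shape
    # Hypothetical starting values; any point inside (0, 1) would do.
    prev, sens, spec = 0.5, np.full(k, 0.8), np.full(k, 0.8)
    draws = {"prev": [], "sens": [], "spec": []}

    for it in range(n_iter):
        # Step 1: sample each subject's latent disease status.
        like_pos = prev * np.prod(sens**tests * (1 - sens) ** (1 - tests), axis=1)
        like_neg = (1 - prev) * np.prod(
            (1 - spec) ** tests * spec ** (1 - tests), axis=1
        )
        diseased = rng.random(n) < like_pos / (like_pos + like_neg)

        # Step 2: conjugate Beta updates given the latent status.
        n_pos = int(diseased.sum())
        true_pos = tests[diseased].sum(axis=0)    # diseased, test positive
        false_neg = n_pos - true_pos              # diseased, test negative
        false_pos = tests[~diseased].sum(axis=0)  # healthy, test positive
        true_neg = (n - n_pos) - false_pos        # healthy, test negative

        prev = rng.beta(1 + n_pos, 1 + n - n_pos)
        sens = rng.beta(1 + true_pos, 1 + false_neg)
        spec = rng.beta(1 + true_neg, 1 + false_pos)

        if it >= burn_in:
            draws["prev"].append(prev)
            draws["sens"].append(sens)
            draws["spec"].append(spec)

    return {name: np.array(values) for name, values in draws.items()}


def raftery_lewis_min_iterations(q, r, s):
    """Lower bound on iterations to estimate quantile q within +/- r
    with probability s, assuming independent draws."""
    z = NormalDist().inv_cdf((1 + s) / 2)
    return math.ceil(z**2 * q * (1 - q) / r**2)


if __name__ == "__main__":
    # Simulated data with made-up parameter values, for illustration only.
    rng = np.random.default_rng(1)
    status = rng.random(300) < 0.3
    positives = np.where(
        status[:, None], rng.random((300, 4)) < 0.9, rng.random((300, 4)) > 0.9
    ).astype(int)
    out = gibbs_latent_class(positives)
    print("posterior mean prevalence:", out["prev"].mean())
    print("posterior 2.5% and 97.5% quantiles:",
          np.quantile(out["prev"], [0.025, 0.975]))
    print("minimum iterations:", raftery_lewis_min_iterations(0.025, 0.005, 0.95))
```

With uniform priors the model is not identifiable under swapping of the "diseased" and "healthy" labels, so a chain can settle in a mirrored mode. This is one reason why inspecting convergence, rather than trusting a fixed number of iterations, matters for this sampler.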

[1] S. Hui, et al. Evaluation of diagnostic tests without gold standards, 1998, Statistical methods in medical research.

[2] C. Geyer, et al. Annealing Markov chain Monte Carlo with applications to ancestral inference, 1995.

[3] John K Kruschke, et al. Bayesian data analysis, 2010, Wiley interdisciplinary reviews. Cognitive science.

[4] L. Joseph, et al. Bayesian Approaches to Modeling the Conditional Dependence Between Multiple Diagnostic Tests, 2001, Biometrics.

[5] Gareth O. Roberts, et al. Convergence assessment techniques for Markov chain Monte Carlo, 1998, Stat. Comput.

[6] M. Vieth, et al. High prevalence of gastroesophageal reflux symptoms and esophagitis with or without symptoms in the general adult Swedish population: A Kalixanda study report, 2005, Scandinavian journal of gastroenterology.

[7] A R Zinsmeister, et al. Prevalence and clinical spectrum of gastroesophageal reflux: a population-based study in Olmsted County, Minnesota, 1997, Gastroenterology.

[8] A. P. Dawid, et al. Maximum Likelihood Estimation of Observer Error-Rates Using the EM Algorithm, 1979.

[9] I A Gardner, et al. Estimation of diagnostic-test sensitivity and specificity through Bayesian modeling, 2005, Preventive veterinary medicine.

[10] W O Johnson, et al. Screening without a "gold standard": the Hui-Walter paradigm revisited, 2001, American journal of epidemiology.

[11] L. Joseph, et al. Robustness of Prevalence Estimates Derived from Misclassified Data from Administrative Databases, 2007, Biometrics.

[12] P. Kahrilas. Gastroesophageal reflux disease, 1996, JAMA.

[13] A. Raftery, et al. How Many Iterations in the Gibbs Sampler, 1991.

[14] S. E. Hills, et al. Illustration of Bayesian Inference in Normal Data Models Using Gibbs Sampling, 1990.

[15] J. Richter. Gastrooesophageal reflux disease, 2007, Best practice & research. Clinical gastroenterology.

[16] R. Orlando, et al. Gastroesophageal reflux disease, 2001, Current opinion in gastroenterology.

[17] S D Walter, et al. Estimation of test error rates, disease prevalence and relative risk from misclassified data: a review, 1988, Journal of clinical epidemiology.

[18] L. Joseph, et al. Bayesian estimation of disease prevalence and the parameters of diagnostic tests in the absence of a gold standard, 1995, American journal of epidemiology.

[19] P. Moayyedi, et al. New approaches to enhance the accuracy of the diagnosis of reflux disease, 2004, Gut.

[20] Margaret Sullivan Pepe, et al. Insights into latent class analysis of diagnostic test performance, 2007, Biostatistics.

[21] C. Pellegrini, et al. Pharyngeal pH measurements in patients with respiratory symptoms before and during proton pump inhibitor therapy, 2001, American journal of surgery.

[22] P N Valenstein, et al. Evaluating diagnostic tests with imperfect standards, 1990, American journal of clinical pathology.

[23] S. E. Ahmed, et al. Markov Chain Monte Carlo: Stochastic Simulation for Bayesian Inference, 2008, Technometrics.

[24] Paolo Vicini, et al. Assessing Convergence of Markov Chain Monte Carlo Simulations in Hierarchical Bayesian Models for Population Pharmacokinetics, 2004, Annals of Biomedical Engineering.