Accounting for Nonignorable Verification Bias in Assessment of Diagnostic Tests

A "gold" standard test, providing definitive verification of disease status, may be quite invasive or expensive. Current technological advances provide less invasive, or less expensive, diagnostic tests. Ideally, a diagnostic test is evaluated by comparing it with a definitive gold standard test. However, the decision to perform the gold standard test to establish the presence or absence of disease is often influenced by the results of the diagnostic test, along with other measured, or not measured, risk factors. If only data from patients who received the gold standard test were used to assess the test performance, the commonly used measures of diagnostic test performance--sensitivity and specificity--are likely to be biased. Sensitivity would often be higher, and specificity would be lower, than the true values. This bias is called verification bias. Without adjustment for verification bias, one may possibly introduce into the medical practice a diagnostic test with apparent, but not truly, high sensitivity. In this article, verification bias is treated as a missing covariate problem. We propose a flexible modeling and computational framework for evaluating the performance of a diagnostic test, with adjustment for nonignorable verification bias. The presented computational method can be utilized with any software that can repetitively use a logistic regression module. The approach is likelihood-based, and allows use of categorical or continuous covariates. An explicit formula for the observed information matrix is presented, so that one can easily compute standard errors of estimated parameters. The methodology is illustrated with a cardiology data example. We perform a sensitivity analysis of the dependency of verification selection process on disease.
