A Bayesian approach to sample size determination for studies designed to evaluate continuous medical tests

We develop a Bayesian approach to sample size and power calculations for cross-sectional studies that are designed to evaluate and compare continuous medical tests. For studies that involve one test, or two tests that are either conditionally independent or conditionally dependent, we present methods that are applicable both when the true disease status of sampled individuals will be available and when it will not. Within a hypothesis testing framework, we consider the goal of demonstrating that a medical test has an area under the receiver operating characteristic (ROC) curve that exceeds a minimum acceptable level or another relevant threshold, and the goals of establishing the superiority or equivalence of one test relative to another. A Bayesian average power criterion is used to determine a sample size that will yield high posterior probability, on average, of a future study correctly deciding in favor of these goals. The impacts on Bayesian average power of the prior distributions, the proportion of diseased subjects in the study, and the correlation among tests are investigated through simulation. The computational algorithm we develop simulates multiple data sets, fits each with a Bayesian model using Gibbs sampling, and is executed using WinBUGS in tandem with R.
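The average power criterion described above can be illustrated with a minimal sketch for the simplest case: one test, known disease status, and the goal of showing that the area under the ROC curve (AUC) exceeds a threshold. Everything specific in it is an assumption made for illustration and is not taken from the paper: the binormal model for test scores, the noninformative prior p(mu, sigma^2) proportional to 1/sigma^2 in each group, the fixed design values, and the function names. It also replaces Gibbs sampling in WinBUGS with direct Monte Carlo draws from the closed-form posterior, which is possible only because of that simple prior.

```python
import random
from statistics import NormalDist, fmean, variance


def posterior_prob_auc_exceeds(y0, y1, auc0, rng, draws=500):
    """Estimate P(AUC > auc0 | data) under an assumed binormal model.

    Under the binormal model, AUC = Phi((mu1 - mu0) / sqrt(s0^2 + s1^2)),
    so AUC > auc0 is equivalent to the standardized difference exceeding
    the standard normal quantile of auc0.
    """
    z0 = NormalDist().inv_cdf(auc0)
    stats = [(len(y), fmean(y), variance(y)) for y in (y0, y1)]
    hits = 0
    for _ in range(draws):
        sampled = []
        for n, ybar, s2 in stats:
            # sigma^2 | y ~ (n - 1) s^2 / chi2_{n-1}; chi2_k = Gamma(k/2, scale=2)
            sigma2 = (n - 1) * s2 / rng.gammavariate((n - 1) / 2, 2.0)
            # mu | sigma^2, y ~ Normal(ybar, sigma^2 / n)
            mu = rng.gauss(ybar, (sigma2 / n) ** 0.5)
            sampled.append((mu, sigma2))
        (mu0, v0), (mu1, v1) = sampled
        hits += (mu1 - mu0) / (v0 + v1) ** 0.5 > z0
    return hits / draws


def average_power(n, prevalence, auc0, design, n_sims=100, seed=1):
    """Average, over simulated studies of size n, of P(AUC > auc0 | data).

    `design(rng)` is a hypothetical callable returning the data-generating
    values (mu0, sd0, mu1, sd1); it may return fixed values or draw them
    from a design prior.
    """
    rng = random.Random(seed)
    n1 = max(2, round(n * prevalence))  # diseased subjects
    n0 = max(2, n - n1)                 # non-diseased subjects
    total = 0.0
    for _ in range(n_sims):
        mu0, sd0, mu1, sd1 = design(rng)
        y0 = [rng.gauss(mu0, sd0) for _ in range(n0)]
        y1 = [rng.gauss(mu1, sd1) for _ in range(n1)]
        total += posterior_prob_auc_exceeds(y0, y1, auc0, rng)
    return total / n_sims


if __name__ == "__main__":
    # Assumed design values: true AUC = Phi(1.5 / sqrt(2)), roughly 0.86.
    fixed_design = lambda rng: (0.0, 1.0, 1.5, 1.0)
    for n in (20, 50, 100, 200):
        print(n, average_power(n, 0.5, 0.70, fixed_design))
```

A sample size would then be chosen as the smallest n for which the estimated average power reaches a target level. Varying `prevalence` or swapping in a `design` function that draws its values from a prior gives a rough way to explore the sensitivities mentioned above. Settings with two correlated tests or unknown disease status require a fuller model than this sketch covers.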
