Correcting for bias in the selection and validation of informative diagnostic tests

When developing a new diagnostic test for a disease, there are often multiple candidate classifiers to choose from, and it is unclear whether any will improve on the performance of current technology. A two-stage design can be used to select a promising classifier (if one exists) in stage one for definitive validation in stage two. However, estimating the true properties of the chosen classifier is complicated by the first-stage selection rules. In particular, the usual maximum likelihood estimator (MLE) that combines data from both stages will be biased upwards, because a classifier is carried forward only when its stage-one performance looks favourable. Consequently, confidence intervals and p-values derived from the MLE will also be incorrect. Building on the results of Pepe et al. (SIM 28:762–779), we derive the most efficient conditionally unbiased estimator and exact confidence intervals for a classifier's sensitivity in a two-stage design with arbitrary selection rules, where the conditioning is on the trial proceeding to the validation stage. We apply our estimation strategy to data from a recent family history screening tool validation study by Walter et al. (BJGP 63:393–400) and are able to identify and successfully adjust for bias in the tool's estimated sensitivity to detect those at higher risk of breast cancer. © 2015 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd.
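
The upward bias of the pooled MLE, and its removal by conditioning, can be illustrated in the simplest special case: a single classifier that proceeds to stage two only if its stage-one count of true positives reaches a threshold, which is the futility-rule setting of Pepe et al. rather than the arbitrary selection rules treated in the paper. The Python sketch below is illustrative only. The function names, the sample sizes `n1` and `n2`, the threshold `c` and the true sensitivity `p` are all assumptions made for the example. The estimator is obtained by Rao-Blackwellising the stage-two proportion, that is, by averaging it over the conditional distribution of the stage-one count given the pooled count and the fact that the trial continued.

```python
from math import comb


def binom_pmf(x, n, p):
    """Binomial probability P(X = x) for X ~ Bin(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def naive_mle(x1, x2, n1, n2):
    """Pooled proportion of true positives across both stages."""
    return (x1 + x2) / (n1 + n2)


def rao_blackwell_estimate(z, n1, n2, c):
    """E[X2 / n2 | X1 + X2 = z, X1 >= c].

    The unknown sensitivity cancels from the conditional distribution,
    leaving weights comb(n1, x1) * comb(n2, z - x1) over admissible x1.
    """
    lo, hi = max(c, z - n2), min(n1, z)
    num = den = 0.0
    for x1 in range(lo, hi + 1):
        w = comb(n1, x1) * comb(n2, z - x1)
        num += w * (z - x1) / n2
        den += w
    return num / den


def conditional_expectation(estimator, p, n1, n2, c):
    """Exact E[estimator(x1, x2) | X1 >= c], by enumerating all outcomes."""
    total = prob = 0.0
    for x1 in range(c, n1 + 1):
        for x2 in range(n2 + 1):
            w = binom_pmf(x1, n1, p) * binom_pmf(x2, n2, p)
            total += w * estimator(x1, x2)
            prob += w
    return total / prob


# Hypothetical design: 20 cases in stage one, 40 in stage two,
# continue only if at least 15 stage-one cases test positive.
n1, n2, c, p = 20, 40, 15, 0.7

bias_mle = conditional_expectation(
    lambda x1, x2: naive_mle(x1, x2, n1, n2), p, n1, n2, c
) - p
bias_rb = conditional_expectation(
    lambda x1, x2: rao_blackwell_estimate(x1 + x2, n1, n2, c), p, n1, n2, c
) - p

print(f"conditional bias of pooled MLE:        {bias_mle:+.4f}")
print(f"conditional bias of adjusted estimate: {bias_rb:+.2e}")
```

Because the expectations are computed by exact enumeration rather than simulation, the first printed bias is strictly positive, while the second differs from zero only by floating-point rounding. The stage-two proportion alone would also be conditionally unbiased, but it discards the stage-one data; averaging it over the conditional distribution cannot increase its variance.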

[1] Nigel Stallard, et al. A confirmatory seamless phase II/III clinical trial design incorporating short‐term endpoint information, 2010, Statistics in medicine.

[2] Yudong D. He, et al. Gene expression profiling predicts clinical outcome of breast cancer, 2002, Nature.

[3] Harold B. Sackrowitz, et al. Two stage conditionally unbiased estimators of the selected mean, 1989.

[4] Allan R. Sampson, et al. Drop-the-losers design: Binomial case, 2009, Comput. Stat. Data Anal.

[5] Yi Lu, et al. Novel diagnostic biomarkers for prostate cancer, 2010, Journal of Cancer.

[6] Gabriel Capellá, et al. DNA Methylation Biomarkers for Noninvasive Diagnosis of Colorectal Cancer, 2013, Cancer Prevention Research.

[7] Werner Brannath, et al. Shrinkage estimation in two‐stage adaptive designs with midtrial treatment selection, 2013, Statistics in medicine.

[8] John Whitehead, et al. An exact method for analysis following a two-stage phase II cancer clinical trial, 2010, Statistics in medicine.

[9] E. S. Pearson, et al. The use of confidence or fiducial limits illustrated in the case of the binomial, 1934.

[10] M S Pepe, et al. Phases of biomarker development for early detection of cancer, 2001, Journal of the National Cancer Institute.

[11] M H Gail, et al. Validating and improving models for projecting the absolute risk of breast cancer, 2001, Journal of the National Cancer Institute.

[12] J. Emery, et al. Development and evaluation of a brief self-completed family history screening tool for common chronic disease prevention in primary care, 2013, The British journal of general practice : the journal of the Royal College of General Practitioners.

[13] Korbinian Strimmer, et al. Gene ranking and biomarker discovery under correlation, 2009, Bioinform.

[14] Margaret Sullivan Pepe, et al. Conditional estimation after a two-stage diagnostic biomarker study that allows early termination for futility, 2012, Statistics in medicine.

[15] Martin Posch, et al. Testing and estimation in flexible group sequential designs with adaptive treatment selection, 2005, Statistics in medicine.

[16] Sin-Ho Jung, et al. On the estimation of the binomial probability in multistage clinical trials, 2004, Statistics in medicine.

[17] Jack Bowden, et al. Unbiased Estimation of Selected Treatment Means in Two‐Stage Trials, 2008, Biometrical journal. Biometrische Zeitschrift.

[18] Allan R Sampson, et al. Drop‐the‐Losers Design: Normal Case, 2005, Biometrical journal. Biometrische Zeitschrift.

[19] Unbiased estimation of the parameter of a selected binomial population, 1992.

[20] Margaret Sullivan Pepe, et al. Conditional estimation of sensitivity and specificity from a phase 2 biomarker study allowing early termination for futility, 2009, Statistics in medicine.

[21] Joseph S Koopmeiners, et al. Early termination of a two-stage study to develop and validate a panel of biomarkers, 2013, Statistics in medicine.

[22] Jack Bowden, et al. Conditionally unbiased and near unbiased estimation of the selected treatment mean for multistage drop-the-losers trials, 2013, Biometrical journal. Biometrische Zeitschrift.

[23] Nigel Stallard, et al. Sequential designs for phase III clinical trials incorporating treatment selection, 2003, Statistics in medicine.

[24] Nigel Stallard, et al. Conditionally unbiased estimation in phase II/III clinical trials with early stopping for futility, 2013, Statistics in medicine.