Adaptive sampling in two-phase designs: a biomarker study for progression in arthritis

Response-dependent two-phase designs are used increasingly often in epidemiological studies to ensure sampling strategies offer good statistical efficiency while working within resource constraints. Optimal response-dependent two-phase designs are difficult to implement, however, as they require specification of unknown parameters. We propose adaptive two-phase designs that exploit information from an internal pilot study to approximate the optimal sampling scheme for an analysis based on mean score estimating equations. The frequency properties of estimators arising from this design are assessed through simulation, and they are shown to be similar to those from optimal designs. The design procedure is then illustrated through application to a motivating biomarker study in an ongoing rheumatology research program. Copyright © 2015 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd.
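The adaptive idea described above (draw an internal pilot, estimate the unknown quantities, then allocate the remaining phase-two sample) can be illustrated with a minimal sketch. The abstract does not state the optimality criterion for the mean score analysis, so this example substitutes classical Neyman allocation, in which the remaining sample is split across strata in proportion to stratum size times the pilot standard deviation. The function names, the inputs, and the rounding rule are all hypothetical and are not taken from the paper.

```python
import statistics


def allocate(weights, caps, budget):
    """Split `budget` units across strata roughly in proportion to `weights`,
    never giving stratum k more than caps[k] units."""
    alloc = [0] * len(weights)
    for _ in range(min(budget, sum(caps))):
        open_strata = [k for k in range(len(weights)) if alloc[k] < caps[k]]
        # Highest-averages rounding: the next unit goes to the open stratum
        # with the largest weight divided by (units already allocated + 1).
        best = max(open_strata, key=lambda k: weights[k] / (alloc[k] + 1))
        alloc[best] += 1
    return alloc


def adaptive_allocation(stratum_sizes, pilot_values, total_budget):
    """Return the number of additional phase-two units to draw per stratum.

    stratum_sizes : phase-one counts N_k per stratum (hypothetical input)
    pilot_values  : per-stratum lists of the expensive measurement from the pilot
    total_budget  : total phase-two sample size, pilot units included
    """
    # Pilot estimates of the within-stratum spread (needs >= 2 units per stratum).
    pilot_sds = [statistics.stdev(values) for values in pilot_values]
    # Neyman allocation weights N_k * S_k, used here as a stand-in criterion.
    weights = [n * s for n, s in zip(stratum_sizes, pilot_sds)]
    # Units still available in each stratum after the pilot draw.
    caps = [n - len(values) for n, values in zip(stratum_sizes, pilot_values)]
    remaining = total_budget - sum(len(values) for values in pilot_values)
    return allocate(weights, caps, max(remaining, 0))
```

For example, `adaptive_allocation([100, 100], [[1, 3], [1, 7]], 12)` spends 4 units on the pilot and splits the remaining 8 in a 1:3 ratio, because the second stratum's pilot standard deviation is three times the first's. A design targeting mean score estimating equations would replace the `N_k * S_k` weights with the criterion appropriate to that estimator.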
