An integrated population‐averaged approach to the design, analysis and sample size determination of cluster‐unit trials

While the mixed model approach to cluster randomization trials is relatively well developed, the design and analysis of population-averaged models for randomized and non-randomized cluster trials have received less attention. We provide novel implementations of familiar methods to meet these needs. Three components are applied to a large non-randomized study aimed at reducing underage drinking: a design strategy that selects matching control communities based upon propensity scores, a statistical analysis plan for dichotomous outcomes based upon generalized estimating equations (GEE) with a design-based working correlation matrix, and new sample size formulae. The statistical power calculations, based upon Wald tests for summary statistics, are special cases of a general power method for GEE.
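The abstract does not state the sample size formulae themselves, so the following is only a minimal sketch of the kind of Wald-test power calculation it refers to, not the authors' method. It assumes two arms with equal numbers of equal-sized clusters, a dichotomous outcome, and an exchangeable within-cluster correlation `rho`, so that the variance of each arm's proportion is inflated by the standard design effect 1 + (m - 1)rho. The function names and the numerical inputs are hypothetical and chosen for illustration.

```python
import math
from statistics import NormalDist


def design_effect(m, rho):
    """Variance inflation for clusters of size m with exchangeable correlation rho."""
    return 1.0 + (m - 1) * rho


def wald_power(p0, p1, k, m, rho, alpha=0.05):
    """Approximate power of a two-sided Wald test for p1 - p0.

    k is the number of clusters per arm and m the number of subjects per
    cluster (assumed equal across clusters; an illustrative simplification).
    """
    # Variance of the difference in arm-level proportions under clustering.
    var = (p0 * (1 - p0) + p1 * (1 - p1)) * design_effect(m, rho) / (k * m)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Ignores the negligible probability of rejecting in the wrong direction.
    return NormalDist().cdf(abs(p1 - p0) / math.sqrt(var) - z_crit)


def clusters_per_arm(p0, p1, m, rho, alpha=0.05, power=0.80):
    """Smallest number of clusters per arm reaching the target power."""
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    k = (z ** 2 * (p0 * (1 - p0) + p1 * (1 - p1)) * design_effect(m, rho)
         / (m * (p1 - p0) ** 2))
    return math.ceil(k)


if __name__ == "__main__":
    # Hypothetical inputs: prevalence 30% vs 20%, 50 subjects per community.
    k = clusters_per_arm(0.30, 0.20, m=50, rho=0.02)
    print(k, round(wald_power(0.30, 0.20, k, m=50, rho=0.02), 3))
```

With few clusters per arm, the normal approximation used here can be optimistic; small-sample corrections to the sandwich variance estimator would be a natural refinement.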
