Sample size determination for clustered count data.

We consider the problem of sample size determination for clustered count data. Such data arise naturally in multicenter (or cluster) randomized clinical trials, where patients are nested within research centers. We consider cluster-specific estimators (maximum likelihood based on generalized mixed-effects regression) for subject-level randomized designs and population-averaged estimators (generalized estimating equations) for cluster-level randomized designs. For cross-sectional studies, we provide simple closed-form expressions for the number of clusters needed to compare the event rates of two groups; these expressions are based on either the between-cluster variation or the intracluster correlation. We compare our methods with existing methods both theoretically and numerically. Specifically, we show that the proposed method performs better for subject-level randomized designs, whereas for cluster-level randomized designs the comparative performance depends on the rate ratio. We also provide a versatile method for longitudinal studies. Three real data examples illustrate the results.
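To make the idea of a closed-form cluster count concrete, the sketch below implements one widely used existing approach, the coefficient-of-variation formula of Hayes and Bennett (1999) for comparing two event rates in a cluster-level randomized design. It is not the expression derived in this paper. The function name, parameter names, and example values are assumptions chosen for illustration only.

```python
import math
from statistics import NormalDist


def clusters_per_arm(rate0, rate1, person_years, cv, alpha=0.05, power=0.80):
    """Clusters per arm under the Hayes-Bennett (1999) formula for rates.

    rate0, rate1  : anticipated event rates in the two arms (hypothetical inputs)
    person_years  : follow-up time per cluster
    cv            : between-cluster coefficient of variation of the true rates
    """
    if rate0 == rate1:
        raise ValueError("rates must differ to define a detectable effect")

    # Standard normal quantiles for a two-sided test and the target power.
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)

    # Within-cluster (Poisson) component of the variance.
    poisson_var = (rate0 + rate1) / person_years
    # Between-cluster component, driven by the coefficient of variation.
    between_var = cv ** 2 * (rate0 ** 2 + rate1 ** 2)

    c = 1 + (z_alpha + z_beta) ** 2 * (poisson_var + between_var) / (rate0 - rate1) ** 2
    return math.ceil(c)


# Illustrative inputs only: 0.10 vs 0.05 events per person-year,
# 100 person-years per cluster, coefficient of variation 0.25.
print(clusters_per_arm(0.10, 0.05, person_years=100, cv=0.25))
```

Setting `cv=0` recovers the requirement for independent Poisson counts, so the gap between the two results shows how much between-cluster variation inflates the required number of clusters.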
