Cluster randomised trials with different numbers of measurements at baseline and endline: Sample size and optimal allocation

Background/Aims: Published methods for sample size calculation in cluster randomised trials with baseline data are inflexible and mostly assume that equal amounts of data are collected at baseline and endline, that is, before and after the intervention has been implemented in some clusters. We extend these methods to any amount of baseline and endline data. We explain how to explore sample size for a trial when some baseline data from the trial clusters have already been collected as part of a separate study. Where such data are not available, we show how to choose the proportion of data collection devoted to the baseline within the trial, when a particular cluster size or range of cluster sizes is proposed.

Methods: We provide a design effect given the cluster size and correlation parameters, assuming that different participants are assessed at baseline and endline in the same clusters. We show how to produce plots that identify the impact of varying the amount of baseline data, accounting for the inevitable uncertainty in the cluster autocorrelation. We illustrate the methodology using an example trial.

Results: Baseline data provide more power, or allow a greater reduction in trial size, as the cluster size, intracluster correlation and cluster autocorrelation increase.

Conclusion: Investigators should think carefully before collecting baseline data in a cluster randomised trial if this is at the expense of endline data. In some scenarios, collecting baseline data in place of endline data will increase the sample size required to achieve given power and precision.
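The abstract does not state the design effect itself, so the following is only a minimal sketch of how such a quantity could be computed under assumptions that are ours rather than the authors': a repeated cross-sectional design, an analysis of covariance on cluster means with many clusters, an intracluster correlation `rho`, and a cluster autocorrelation `pi`, so that two participants in the same cluster but different periods have correlation `rho * pi`. The function names and the grid search over the baseline share are hypothetical illustrations, not the paper's method.

```python
def adjusted_variance(m_b, m_e, rho, pi):
    """Variance of a baseline-adjusted cluster endline mean, in units of the
    individual-level variance. m_b and m_e are the numbers of participants
    measured per cluster at baseline and endline (assumed model, see lead-in)."""
    if m_e < 1 or m_b < 0:
        raise ValueError("need m_e >= 1 and m_b >= 0")
    var_endline = rho + (1 - rho) / m_e       # variance of the endline cluster mean
    if m_b == 0:
        return var_endline                    # no baseline data: nothing to adjust for
    var_baseline = rho + (1 - rho) / m_b      # variance of the baseline cluster mean
    covariance = rho * pi                     # covariance between the two cluster means
    # Residual variance after regressing the endline mean on the baseline mean.
    return var_endline - covariance ** 2 / var_baseline


def design_effect(m_b, m_e, rho, pi):
    """Inflation relative to individual randomisation of m_e participants
    per cluster analysed at endline only."""
    return m_e * adjusted_variance(m_b, m_e, rho, pi)


def best_baseline_size(total, rho, pi):
    """Grid search: for a fixed total of measurements per cluster, return the
    number allocated to baseline that minimises the adjusted variance."""
    return min(
        range(0, total),
        key=lambda m_b: adjusted_variance(m_b, total - m_b, rho, pi),
    )
```

With equal baseline and endline sizes this reduces to the familiar form (1 + (m - 1)rho)(1 - r^2), where r = m * rho * pi / (1 + (m - 1)rho). Evaluating `best_baseline_size` over a range of plausible `pi` values is one way to see how sensitive the preferred allocation is to uncertainty in the cluster autocorrelation.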
