Advanced statistics: statistical methods for analyzing cluster and cluster-randomized data.

Sometimes interventions in randomized clinical trials are allocated not to individual patients but to groups of patients. This is called cluster allocation, or cluster randomization, and is particularly common in health services research. Similarly, in some types of observational studies, patients (or observations) are found in naturally occurring groups, such as neighborhoods. In either situation, observations within a cluster tend to be more alike than observations selected entirely at random. This violates the assumption of independence that is at the heart of common methods of statistical estimation and hypothesis testing. Failure to account for the dependence among observations within the same cluster can have profound implications for the design and analysis of such studies: p-values will be too small, confidence intervals too narrow, and sample size estimates too small, sometimes to a dramatic degree. This problem is similar to that caused by the more familiar "unit of analysis error" seen when repeated observations on the same subjects are treated as independent. The purpose of this paper is to provide an introduction to the problem of clustered data in clinical research. It provides guidance on, and examples of, methods for analyzing clustered data and calculating sample sizes when planning studies. The article concludes with some general comments on statistical software for clustered data and on principles for planning, analyzing, and presenting such studies.
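To make the size of the problem concrete, the sketch below uses the widely used design effect, 1 + (m - 1) * ICC, where m is the number of subjects per cluster and ICC is the intracluster correlation coefficient. It is an illustration under stated assumptions, not necessarily the calculation presented in the paper: it assumes equal cluster sizes, and the function names and the example values (21 subjects per cluster, ICC of 0.05, 200 subjects required under individual randomization) are hypothetical.

```python
import math


def design_effect(cluster_size: float, icc: float) -> float:
    """Variance inflation factor for clustered data: 1 + (m - 1) * ICC.

    Assumes every cluster contains the same number of subjects (m).
    """
    return 1.0 + (cluster_size - 1.0) * icc


def inflated_sample_size(n_individual: int, cluster_size: float, icc: float) -> int:
    """Total subjects needed once clustering is taken into account.

    n_individual is the sample size that would suffice if subjects
    were randomized individually (hypothetical input).
    """
    n = n_individual * design_effect(cluster_size, icc)
    # Round before taking the ceiling to avoid floating-point overshoot.
    return math.ceil(round(n, 9))


if __name__ == "__main__":
    # Illustrative values only; they are not taken from the paper.
    deff = design_effect(cluster_size=21, icc=0.05)
    total = inflated_sample_size(n_individual=200, cluster_size=21, icc=0.05)
    print(f"design effect: {deff:.2f}")
    print(f"subjects required: {total}")
    print(f"clusters required: {math.ceil(total / 21)}")
```

With these illustrative inputs, even a small intracluster correlation doubles the required number of subjects, which is why an analysis or sample size calculation that ignores clustering can be misleading to a dramatic degree.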
