Cluster analysis and disease mapping—why, when, and how? A step by step guide

Growing public awareness of environmental hazards has led to an increased demand for public health authorities to investigate geographical clustering of diseases. Although such cluster analysis is nearly always ineffective in identifying causes of disease, it often has to be used to address public concern about environmental hazards. Interpreting the resulting data is not straightforward, however, and this paper presents a guide for the non-specialist. The pitfalls include the fact that cluster analyses are usually done post hoc, and not as a result of a prior hypothesis. This is particularly true for investigations prompted by reported clusters, which have the inherent danger of overestimating the disease rate through "boundary shrinkage" of the population from which the cases are assumed to have arisen. In disease surveillance, the problem of making multiple comparisons can be overcome by testing for clustering and autocorrelation. When rates of disease are illustrated in disease maps, undue focus on areas where random fluctuation is greatest can be minimised by smoothing techniques. Despite the fact that cluster analyses rarely prove fruitful in identifying causation, they may, like single case reports, have the potential to generate new knowledge.
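
As an illustration of the smoothing idea mentioned above, the following is a minimal sketch of one simple approach: a global empirical Bayes shrinkage of standardised morbidity ratios using a method-of-moments estimate of the between-area variance. It is not necessarily the method the paper recommends. The function name `eb_smooth` and the observed and expected counts are hypothetical, chosen only to show that areas with small expected counts, where random fluctuation is greatest, are pulled most strongly towards the overall ratio.

```python
def eb_smooth(observed, expected):
    """Shrink raw observed/expected ratios towards the overall ratio.

    Sketch of a global method-of-moments empirical Bayes smoother.
    `observed` and `expected` are per-area case counts (hypothetical inputs).
    Returns (overall_ratio, raw_ratios, smoothed_ratios).
    """
    if len(observed) != len(expected) or not observed:
        raise ValueError("observed and expected must be non-empty and equal length")
    if any(e <= 0 for e in expected):
        raise ValueError("expected counts must be positive")

    total_expected = sum(expected)
    overall = sum(observed) / total_expected          # pooled ratio across all areas
    raw = [o / e for o, e in zip(observed, expected)]

    # Weighted variance of the raw ratios around the pooled ratio.
    weighted_var = sum(e * (r - overall) ** 2
                       for e, r in zip(expected, raw)) / total_expected
    mean_expected = total_expected / len(expected)

    # Subtract the part of the variance expected from Poisson noise alone;
    # truncate at zero so the between-area variance is never negative.
    between_var = max(weighted_var - overall / mean_expected, 0.0)

    smoothed = []
    for e, r in zip(expected, raw):
        # Weight is near 0 for tiny areas (mostly noise), nearer 1 for large ones.
        weight = between_var / (between_var + overall / e) if between_var > 0 else 0.0
        smoothed.append(overall + weight * (r - overall))
    return overall, raw, smoothed


# Hypothetical data: two very small areas and three larger ones.
observed_cases = [3, 0, 12, 95, 210]
expected_cases = [1.0, 1.2, 10.0, 100.0, 200.0]
overall_ratio, raw_ratios, smoothed_ratios = eb_smooth(observed_cases, expected_cases)
```

In this sketch the first area has a raw ratio of 3.0 based on only three cases; after smoothing it sits much closer to the overall ratio, so a map drawn from the smoothed values gives it less visual prominence.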
