Selecting optimal statistical tools.

The first step in analyzing any data is to consider the architecture of the study design and the quality of the data. Numerous study designs exist, and they vary in their susceptibility to distorting biases. Maclure identifies 32 distinct combinations of taxonomic dimensions by ranking study designs on axes regarding (in decreasing order of importance) randomization, aggregation, parameter timing, selection/allocation, blindedness, and proximity. Maclure prefers the term aggregation to the often-used term ecologic because groupings may be based on nonspatial units (eg, occupation, family membership, or birth period) rather than on places per se. Parameter timing refers to whether exposure or outcome is measured first or whether the two are measured simultaneously; proximity refers to the relative timing of measurements, ranging from historic to concurrent. His final figure, which combines six preliminary figures of taxonomic dimensions, is reproduced here (Figure 1). Validity strengthens as one moves to the right of center in his summary figure. This taxonomy indicates that blinded randomized trials provide the strongest designs, nonrandomized aggregational studies are the most susceptible to bias, and nonrandomized person-based studies fall in between.

After selecting an optimal study design and collecting quality data, one faces the tasks of organizing, summarizing, analyzing, and displaying those data. Statistical approaches and concepts that may be useful in hospital epidemiology were listed in a previous column (1993;14:161-162). Details describing the use of related statistical methods also have been published in this journal. This month's column provides algorithms to guide the selection of the best method(s) for analysis and display of surveillance or research data. Such analysis may involve both numeric and graphical methods. Numeric analysis and graphical display are related but fundamentally different processes.
Numeric data analysis is data-reductive: a wealth of observations is reduced to curt summary statistics. Graphical display, by contrast, can provide overviews that preserve the richness of data sets and promote visual detection of (sometimes unanticipated) patterns. When done well, graphs show general data behavior and specific detail with unparalleled impact. Several tools for numeric analysis are available, namely the following.