An Information Theory Model for Pattern Analysis

The pattern model described in this paper is based on information analysis. It assumes that the collection to be analysed consists of contiguous sampling units, either quadrats or line segments, each characterized by species frequency or species density, defined in the usual phytosociological sense (see Greig-Smith 1964). The procedure is similar to Greig-Smith's pattern analysis (1952, 1961), except that information, and not the sum of squares, is partitioned into components. Our information theory model (I-model) and Greig-Smith's sum of squares model (SS-model) both define pattern as a spatial property and, accordingly, measure pattern in relation to area or distance. This is in sharp contrast to the topological concept of pattern, in which attention is focused on the properties of symbol sequences. An example of such a pattern is the mingling of density phases in a transect (see Pielou 1967, 1969), irrespective of any ground scale.

Both the SS-model and the I-model use a nested hierarchical analysis of pattern variation. This variation is measured as the sum of squares or the information generated by the departure of the observations from a state of perfect homogeneity, characterized by identical counts or frequencies in the sampling units. Once this departure is measured, it is compared with the departure expected on the assumption that the individual plant specimens are distributed randomly on the ground. We may note that while the SS-model is probably best suited to the analysis of measurement data, the I-model should be applied only to data of a categorical type, such as counts or frequencies. It may also be stressed that the I-model is computationally less tedious than the SS-model. The relative power of the two models in representing pattern in natural situations, however, has yet to be determined. Any attempt to do so will probably run into difficulties because the observations on which pattern analysis is performed are not independent in most natural situations. It is a direct consequence of this lack of independence that the distributional properties of the residuals in both models may be inconsistent.

The purpose of the present paper is to derive the I-model and to illustrate its application to the species data given in Table 1. The entries in Table 1 represent counts of individual tillers of Andropogon scoparius Michx. intercepted within 40 cm segments of a line transect. Some environmental data, for use in the SS-model, are also given (see Table 2). The transect traversed three beach ridges, each marking a previous position of the shoreline, and two intervening slacks in a grassland community on the shore of Lake Erie at Rondeau Provincial Park.
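The nested partition of information outlined above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the I-model as derived in the paper: it assumes that the departure of a set of counts from perfect homogeneity is measured as I = sum of x_i ln(x_i / mean), a common likelihood-ratio form, and that block sizes double at each level as in Greig-Smith's scheme. The function names information and nested_partition are hypothetical, and the counts are made up for the example; they are not the Table 1 values.

```python
import math


def information(counts):
    """Departure of counts from equal counts, in nats (assumed form).

    I = sum(x_i * ln(x_i / mean)), taking 0 * ln(0) as 0.
    """
    total = sum(counts)
    if total == 0:
        return 0.0
    mean = total / len(counts)
    return sum(x * math.log(x / mean) for x in counts if x > 0)


def nested_partition(counts):
    """Split the total information into one component per block size.

    The number of contiguous units must be a power of two. The component
    for block size s is the information between adjacent blocks of size s
    that are paired into blocks of size 2s.
    """
    n = len(counts)
    if n == 0 or n & (n - 1):
        raise ValueError("number of sampling units must be a power of two")
    components = {}
    level = list(counts)
    size = 1
    while len(level) > 1:
        pairs = [(level[i], level[i + 1]) for i in range(0, len(level), 2)]
        components[size] = sum(information(pair) for pair in pairs)
        # Pool each pair to form the blocks of the next level.
        level = [a + b for a, b in pairs]
        size *= 2
    return components


# Hypothetical counts in eight contiguous transect segments.
example_counts = [4, 0, 2, 2, 1, 3, 0, 4]
for block_size, value in nested_partition(example_counts).items():
    print(block_size, round(value, 3))
```

Under this assumed measure the components add up exactly to the information of the whole series, which parallels the additivity of the sums of squares in the SS-model. A large component at a given block size would suggest heterogeneity at that scale, subject to the comparison with random expectation that the paper describes.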