Bayesian classification of catchments using spatial data: A first step to improved modelling of catchment effects on stream ecological condition: MODSIM05 - International Congress on Modelling and Simulation: Advances and Applications for Management and Decision Making, Proceedings

A major challenge facing freshwater ecologists and managers is the development of models that link stream ecological condition to catchment scale effects, such as land use. Previous attempts to make such models have followed two general approaches. The bottom-up approach employs mechanistic models, which can quickly become too complex to be useful. The top-down approach employs empirical models derived from large data sets, and has often suffered from large amounts of unexplained variation in stream condition. We believe that the lack of success of both modelling approaches may be at least partly explained by scientists considering too wide a breadth of catchment type. Thus, we believe that by stratifying large sets of catchments into groups of similar types prior to modelling, both types of models may be improved. This paper describes preliminary work using a Bayesian classification software package, ‘Autoclass’ (Cheeseman and Stutz 1996) to create classes of catchments within the Murray Darling Basin based on physiographic data. Autoclass uses a model-based classification method that employs finite mixture modelling and trades off model fit versus complexity, leading to a parsimonious solution. The software provides information on the posterior probability that the classification is ‘correct’ and also probabilities for alternative classifications. The importance of each attribute in defining the individual classes is calculated and presented, assisting description of the classes. Each case is ‘assigned’ to a class based on membership probability, but the probability of membership of other classes is also provided. This feature deals very well with cases that do not fit neatly into a larger class. Lastly, Autoclass requires the user to specify the measurement error of continuous variables. Catchments were derived from the Australian digital elevation model. Physiographic data werederived from national spatial data sets. There was very little information on measurement errors for the spatial data, and so a conservative error of 5% of data range was adopted for all continuous attributes. The incorporation of uncertainty into spatial data sets remains a research challenge. The results of the classification were very encouraging. The software found nine classes of catchments in the Murray Darling Basin. The classes grouped together geographically, and followed altitude and latitude gradients, despite the fact that these variables were not included in the classification. Descriptions of the classes reveal very different physiographic environments, ranging from dry and flat catchments (i.e. lowlands), through to wet and hilly catchments (i.e. mountainous areas). Rainfall and slope were two important discriminators between classes. These two attributes, in particular, will affect the ways in which the stream interacts with the catchment, and can thus be expected to modify the effects of land use change on ecological condition. Thus, realistic models of the effects of land use change on streams would differ between the different types of catchments, and sound management practices will differ. A small number of catchments were assigned to their primary class with relatively low probability. These catchments lie on the boundaries of groups of catchments, with the second most likely class being an adjacent group. The locations of these ‘uncertain’ catchments show that the Bayesian classification dealt well with cases that do not fit neatly into larger classes. Although the results are intuitive, we cannot yet assess whether the classifications described in this paper would assist the modelling of catchment scale effects on stream ecological condition. It is most likely that catchment classification and modelling will be an iterative process, where the needs of the model are used to guide classification, and the results of classifications used to suggest further refinements to models.