How to make biological surveys go further with generalised linear models

Abstract The cost of surveys, in both dollars and manpower, means that total inventories of whole regions are unlikely to be completed, if they are undertaken at all, before decisions have to be made to change current land use practices. There is therefore a need to extrapolate existing location-specific information across whole regions with increased confidence in the resulting spatial predictions. Recent developments in statistical modelling provide methods appropriate to many types of biological data. Taken together with regression-diagnostic techniques, they offer the biologist or land manager species distribution databases that are more reliable and more spatially complete, and on which conservation decisions can be based. This paper describes one component of these developments and the application of three regression-diagnostic techniques: the use of residuals to test the statistical assumptions implicit in the fitted regression model; the use of estimates of the potential influence each observation has on the fitted model; and the use of the coefficient of sensitivity of the model to individual observations. Guidelines are given to assist with the construction of a predictive model from a group of potential explanatory or predictor variables. The use of generalised linear regression models and regression diagnostics is discussed in terms of their impact on survey design.
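The three diagnostics named in the abstract can be illustrated with a minimal sketch, assuming a logistic (binomial) generalised linear model with an intercept and a single predictor. The presence/absence data, the function names `fit_logistic` and `diagnostics`, and the predictor are hypothetical and chosen only for illustration. The influence measure shown is a standard one-step, Cook's-distance-style approximation, which may differ from the exact coefficient of sensitivity used in the paper.

```python
import math

# Hypothetical survey data (assumption, not from the paper):
# x is an environmental predictor at each site, y is presence (1) or absence (0).
X_DEMO = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
Y_DEMO = [0, 0, 1, 0, 0, 1, 1, 0, 1, 1]


def _weighted_sums(x, w):
    """Entries of X'WX for the design matrix with columns [1, x]."""
    s_w = sum(w)
    s_wx = sum(wi * xi for wi, xi in zip(w, x))
    s_wxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    return s_w, s_wx, s_wxx


def fit_logistic(x, y, n_iter=25):
    """Fit logit(p) = b0 + b1 * x by iteratively reweighted least squares."""
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        p = [1.0 / (1.0 + math.exp(-(b0 + b1 * xi))) for xi in x]
        w = [pi * (1.0 - pi) for pi in p]
        # Working response for the weighted least-squares step.
        z = [b0 + b1 * xi + (yi - pi) / wi for xi, yi, pi, wi in zip(x, y, p, w)]
        s_w, s_wx, s_wxx = _weighted_sums(x, w)
        s_wz = sum(wi * zi for wi, zi in zip(w, z))
        s_wxz = sum(wi * xi * zi for wi, xi, zi in zip(w, x, z))
        det = s_w * s_wxx - s_wx ** 2
        # Solve the 2x2 normal equations (X'WX) b = X'Wz explicitly.
        b0 = (s_wxx * s_wz - s_wx * s_wxz) / det
        b1 = (s_w * s_wxz - s_wx * s_wz) / det
    return b0, b1


def diagnostics(x, y, b0, b1, n_params=2):
    """Per-observation residuals, leverage and a one-step influence measure."""
    p = [1.0 / (1.0 + math.exp(-(b0 + b1 * xi))) for xi in x]
    w = [pi * (1.0 - pi) for pi in p]
    s_w, s_wx, s_wxx = _weighted_sums(x, w)
    det = s_w * s_wxx - s_wx ** 2
    rows = []
    for xi, yi, pi, wi in zip(x, y, p, w):
        pearson = (yi - pi) / math.sqrt(wi)
        loglik = yi * math.log(pi) + (1 - yi) * math.log(1.0 - pi)
        deviance = math.copysign(math.sqrt(-2.0 * loglik), yi - pi)
        # Diagonal of the weighted hat matrix W^(1/2) X (X'WX)^(-1) X' W^(1/2).
        leverage = wi * (s_wxx - 2.0 * s_wx * xi + s_w * xi * xi) / det
        # Cook's-distance-style approximation to the effect of deleting the case.
        cook = pearson ** 2 * leverage / (n_params * (1.0 - leverage) ** 2)
        rows.append({"x": xi, "pearson": pearson, "deviance": deviance,
                     "leverage": leverage, "cook": cook})
    return rows


if __name__ == "__main__":
    b0, b1 = fit_logistic(X_DEMO, Y_DEMO)
    for row in diagnostics(X_DEMO, Y_DEMO, b0, b1):
        print(row)
```

Sites with large residuals suggest that the model assumptions fail there, sites with high leverage have unusual predictor values, and sites with a large influence value are those whose removal would change the fitted model most. In a survey-design context, the last two are natural candidates for re-checking or for additional sampling nearby.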
