The influence of observations on misclassification probability estimates in linear discriminant analysis

SUMMARY The influence of observations upon misclassification probability estimates in linear discriminant analysis is examined. For a single observation, an exact expression is given which confirms two surprising results of Campbell (1978). The expression also shows that the influence of an observation is governed by two quantities: (a) the difference between its linear discriminant score and that of its sample mean, and (b) its atypicality estimate for its own population. This is analogous to linear model results, where an observation's influence depends upon its residual and its leverage. However, important differences from the regression situation are noted. Contours of this one-at-a-time influence can be superimposed on the plot introduced by Critchley & Ford (1985). Examining the joint influence of several observations is complicated by the computational burden and by possible masking effects. A quadratic approximation is developed for this problem using Pregibon's (1981) case weights scheme. This approximation has an error of order $n^{-3}$, where $n$ denotes the assumed common order of the sample sizes. An example is given.
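
To make the quantities named in the summary concrete, the following is a minimal brute-force sketch and not the exact expression or the quadratic approximation of the paper. It assumes two groups, two variables, equal priors, and the standard plug-in estimate $\Phi(-D/2)$ of the misclassification probability, where $D$ is the sample Mahalanobis distance between the group means. The squared Mahalanobis distance of an observation from its own group mean is used only as a stand-in for its atypicality estimate. All function names are hypothetical.

```python
import math

def mean(rows):
    """Column means of a list of 2-vectors."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]

def scatter(rows, m):
    """2x2 matrix of sums of squares and cross-products about m."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - m[0], r[1] - m[1]]
        for a in range(2):
            for b in range(2):
                s[a][b] += d[a] * d[b]
    return s

def pooled_cov(g1, g2):
    """Pooled within-group covariance with n1 + n2 - 2 degrees of freedom."""
    s1, s2 = scatter(g1, mean(g1)), scatter(g2, mean(g2))
    dof = len(g1) + len(g2) - 2
    return [[(s1[a][b] + s2[a][b]) / dof for b in range(2)] for a in range(2)]

def solve2(S, v):
    """Solve the 2x2 system S x = v by Cramer's rule."""
    det = S[0][0] * S[1][1] - S[0][1] * S[1][0]
    return [(S[1][1] * v[0] - S[0][1] * v[1]) / det,
            (S[0][0] * v[1] - S[1][0] * v[0]) / det]

def mahalanobis_sq_between(g1, g2):
    """Squared sample Mahalanobis distance D^2 between the group means."""
    m1, m2 = mean(g1), mean(g2)
    d = [m1[0] - m2[0], m1[1] - m2[1]]
    a = solve2(pooled_cov(g1, g2), d)
    return d[0] * a[0] + d[1] * a[1]

def plug_in_error(g1, g2):
    """Plug-in estimate Phi(-D/2), assuming equal priors (an assumption here)."""
    D = math.sqrt(mahalanobis_sq_between(g1, g2))
    return 0.5 * math.erfc(D / (2.0 * math.sqrt(2.0)))

def deletion_influence(g1, g2, i):
    """Change in the plug-in estimate when observation i of group 1 is deleted."""
    reduced = g1[:i] + g1[i + 1:]
    return plug_in_error(reduced, g2) - plug_in_error(g1, g2)

def own_group_quantities(x, g1, g2):
    """Return (score difference from the group-1 mean, squared distance to it).

    The second value is only a proxy for the atypicality estimate.
    """
    m1, m2 = mean(g1), mean(g2)
    S = pooled_cov(g1, g2)
    coef = solve2(S, [m1[0] - m2[0], m1[1] - m2[1]])  # discriminant coefficients
    dev = [x[0] - m1[0], x[1] - m1[1]]
    score_diff = coef[0] * dev[0] + coef[1] * dev[1]
    w = solve2(S, dev)
    dist_sq = dev[0] * w[0] + dev[1] * w[1]
    return score_diff, dist_sq

# Illustrative data only, not taken from the paper.
group1 = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
group2 = [[3.0, 3.0], [4.0, 3.0], [3.0, 4.0], [4.0, 4.0]]
for i, x in enumerate(group1):
    print(i, own_group_quantities(x, group1, group2),
          deletion_influence(group1, group2, i))
```

Recomputing the estimate after each deletion is feasible one observation at a time, but the number of refits grows combinatorially for subsets of several observations, which is the computational burden the summary refers to.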