The optimum nonlinear features for a scatter criterion in discriminant analysis

The scatter criterion applied to nonlinear mappings is discussed. The trace of the between-class scatter matrix normalized by the mixture scatter matrix is chosen as the criterion to measure the degree of overlap among class distributions, and it is shown that the {\em a posteriori} probability functions maximize this criterion. When a set of nonlinear features is selected, the projections of the {\em a posteriori} probability functions onto the selected feature space are found; these projections approximate the {\em a posteriori} probability functions. The mean-square error of the approximation represents the difference between the criterion value in the feature space and its optimum value. The estimation of the optimum criterion value and the evaluation of an additional feature are also discussed.
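
As a rough numerical illustration of these statements, the following Python sketch evaluates the criterion $J = \mathrm{tr}(S_m^{-1} S_b)$ exactly on a small discrete two-class problem. The sample space, the probabilities, the polynomial features, and the function names `scatter_criterion` and `posterior_gap` are all assumptions made for illustration and are not taken from the paper. The weighting of the mean-square error by the inverse class priors follows from the definitions used in the code and may differ in form from the paper's own expression.

```python
import numpy as np


def scatter_criterion(features, joint):
    """J = tr(S_m^{-1} S_b) for features defined on a finite sample space.

    features: (n_points,) or (n_points, n_features); row k is y(x_k).
    joint:    (n_classes, n_points); joint[i, k] = P(class i, x = x_k).
    """
    features = np.asarray(features, dtype=float).reshape(joint.shape[1], -1)
    priors = joint.sum(axis=1)                          # P(class i)
    p_x = joint.sum(axis=0)                             # mixture P(x_k)
    class_means = (joint @ features) / priors[:, None]  # M_i
    mixture_mean = p_x @ features                       # M_0
    centered = features - mixture_mean
    s_m = (centered * p_x[:, None]).T @ centered        # mixture scatter
    dev = class_means - mixture_mean
    s_b = (dev * priors[:, None]).T @ dev               # between-class scatter
    return float(np.trace(np.linalg.solve(s_m, s_b)))


def posterior_gap(features, joint):
    """Sum over classes of MSE_i / P_i, where MSE_i is the minimum
    mean-square error of approximating P(class i | x) by an affine
    function of the features (weighted least squares under P(x))."""
    features = np.asarray(features, dtype=float).reshape(joint.shape[1], -1)
    priors = joint.sum(axis=1)
    p_x = joint.sum(axis=0)
    posteriors = (joint / p_x).T                        # (n_points, n_classes)
    design = np.hstack([np.ones((len(p_x), 1)), features])
    w = np.sqrt(p_x)[:, None]                           # weights for the fit
    coef, *_ = np.linalg.lstsq(w * design, w * posteriors, rcond=None)
    resid = posteriors - design @ coef
    mse = p_x @ resid**2                                # one value per class
    return float(np.sum(mse / priors))


# Hypothetical two-class problem on the sample space x in {0, 1, 2, 3}.
x = np.arange(4.0)
joint = np.array([[0.20, 0.15, 0.10, 0.05],   # P(class 1, x)
                  [0.15, 0.05, 0.10, 0.20]])  # P(class 2, x)

posterior = joint[0] / joint.sum(axis=0)      # P(class 1 | x)

j_opt = scatter_criterion(posterior, joint)
j_lin = scatter_criterion(x, joint)
j_quad = scatter_criterion(np.column_stack([x, x**2]), joint)
j_cubic = scatter_criterion(np.column_stack([x, x**2, x**3]), joint)

print("J(posterior)     =", j_opt)
print("J(x)             =", j_lin, " gap:", posterior_gap(x, joint))
print("J(x, x^2)        =", j_quad)
print("J(x, x^2, x^3)   =", j_cubic)
```

In this toy setting the posterior feature attains the largest criterion value, each added polynomial feature closes part of the gap, and the cubic feature set reaches the optimum because polynomials of degree three can represent any function on four points. With more than two classes, only $L-1$ of the $L$ posterior functions should be used as features in `scatter_criterion`, since they sum to one and the full set would make the mixture scatter matrix singular.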