Multi-scale kernel discriminant analysis

The bandwidth that minimizes the mean integrated squared error of a kernel density estimator is not always a good choice when the density estimate is used for classification. On the other hand, cross-validation-based techniques for choosing bandwidths may not be computationally feasible when there are many competing classes. Instead of concentrating on a single optimum bandwidth for each population density estimate, it is more useful in practice to look at the results for different scales of smoothing. This paper presents such a multi-scale approach to classification using kernel density estimates, along with a graphical device that leads to a more informative discriminant analysis. The usefulness of the proposed methodology is illustrated using some benchmark data sets.
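
As a rough illustration of looking at classification results across several scales of smoothing, the following minimal sketch computes kernel-density-based posterior class probabilities over a grid of bandwidths. It is not the paper's procedure: the one-dimensional Gaussian kernel, the single bandwidth shared by all classes, the equal priors, the simulated data, and the function names `gaussian_kde_1d` and `multiscale_posteriors` are all assumptions made for illustration.

```python
import numpy as np


def gaussian_kde_1d(x, sample, h):
    """Gaussian kernel density estimate at points x with bandwidth h."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    sample = np.asarray(sample, dtype=float)
    # Standardized distance between every evaluation point and every observation
    z = (x[:, None] - sample[None, :]) / h
    return np.exp(-0.5 * z**2).sum(axis=1) / (len(sample) * h * np.sqrt(2.0 * np.pi))


def multiscale_posteriors(x, samples, priors, bandwidths):
    """Posterior class probabilities at x for each bandwidth.

    Returns an array of shape (n_bandwidths, n_classes, n_points).
    Assumption: the same bandwidth is used for every class.
    """
    out = []
    for h in bandwidths:
        # Prior-weighted density estimate for each class
        dens = np.array([p * gaussian_kde_1d(x, s, h) for s, p in zip(samples, priors)])
        total = dens.sum(axis=0, keepdims=True)
        # Guard against division by zero if every class density underflows
        out.append(dens / np.maximum(total, np.finfo(float).tiny))
    return np.array(out)


# Hypothetical two-class data, simulated purely for illustration
rng = np.random.default_rng(0)
class0 = rng.normal(loc=0.0, scale=1.0, size=50)
class1 = rng.normal(loc=3.0, scale=1.0, size=50)

bandwidths = np.logspace(-1, 1, 5)  # smoothing scales from 0.1 to 10
post = multiscale_posteriors([0.0, 3.0], [class0, class1], [0.5, 0.5], bandwidths)

# One row per bandwidth: posterior of class 0 at each of the two test points
for h, p in zip(bandwidths, post):
    print(f"h = {h:6.2f}  P(class 0 | x) = {np.round(p[0], 3)}")
```

Plotting the posterior of each class against the bandwidth for a given observation is one simple way to see how stable its classification is across scales of smoothing.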