A Practical View of Suboptimal Bayesian Classification with Radial Gaussian Kernels

For pattern classification in a multi-dimensional space, the minimum misclassification rate is obtained by using the Bayes criterion. Kernel estimators, or probabilistic neural networks, provide a good way to evaluate the probability densities of each class of data and are an interesting parallel implementation of the Bayesian classifier [1]. However, their training procedure leads to a very high number of neurons when large datasets are available; the classifier then becomes too complex and time consuming for on-line operation. Suboptimal Bayesian classifiers based on radial Gaussian kernels [2] use an iterative unsupervised learning method based on vector quantization to obtain a significant simplification of the network structure, while keeping sufficiently accurate estimates of the probability densities. In this paper, we study the vector quantization problem and the effects of codebook size and data space dimension on the optimal width factors of the radial Gaussian kernels used in the estimation.
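To make the approach concrete, the following is a minimal Python sketch of the kind of classifier the abstract describes: each class is summarized by a small codebook obtained through vector quantization (here a basic k-means loop, used as an illustrative stand-in for the paper's iterative unsupervised method), class-conditional densities are estimated with radial Gaussian kernels of width `sigma` centered on the codebook entries, and the Bayes rule picks the class with the largest prior-weighted density. All function names and parameter choices are hypothetical, not taken from the paper.

```python
import numpy as np

def vq_codebook(X, k, iters=20, seed=0):
    """Simple k-means vector quantization: compress X into k centroids."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # Assign each sample to its nearest centroid, then re-estimate.
        d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            pts = X[labels == j]
            if len(pts):
                centroids[j] = pts.mean(axis=0)
    return centroids

def kernel_density(x, centroids, sigma):
    """Radial Gaussian kernel density estimate at x from a codebook.

    The width factor sigma is the quantity whose optimal value,
    as a function of codebook size and dimension, the paper studies.
    """
    d2 = ((centroids - x) ** 2).sum(axis=1)
    dim = centroids.shape[1]
    norm = (2.0 * np.pi * sigma ** 2) ** (dim / 2.0)
    return np.exp(-d2 / (2.0 * sigma ** 2)).sum() / (len(centroids) * norm)

def classify(x, codebooks, priors, sigma):
    """Bayes criterion: choose the class maximizing prior * density."""
    scores = [p * kernel_density(x, cb, sigma)
              for cb, p in zip(codebooks, priors)]
    return int(np.argmax(scores))
```

For example, with two well-separated 2-D Gaussian classes, codebooks of only 8 centroids per class (instead of one kernel per training sample) are enough for the classifier to assign points near each class mean correctly; this is the structural simplification the vector quantization step provides.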