Nonparametric Bayes Risk Estimation for Pattern Classification

The performance of a pattern classification system is often evaluated by the risk incurred by the classification procedure. The minimum attainable risk is the Bayes risk. The Bayes risk can therefore be used as a measure of the intrinsic complexity of the system, and it also serves as a reference against which the optimality of a classification procedure can be judged. In many practical situations, nonparametric methods may be needed to estimate the Bayes risk. One such method is based on probability density estimation. The convergence properties of this estimation technique are studied under fairly general assumptions. In the computer experiments reported, the estimate of the Bayes risk is taken as the sample mean of the density estimate, computed with the leave-one-out method. The probability density estimate used is the one proposed by Loftsgaarden and Quesenberry. The resulting risk estimate is shown to be, in general, superior to the risk estimate associated with a Bayes-like decision rule based on the error-counting scheme. It is also compared experimentally with the risk estimate associated with the nearest neighbor rule.
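
As an illustration only, the following minimal sketch shows one way a leave-one-out risk estimate of this kind could be computed under 0-1 loss. It is not taken from the paper: the function names, the choice of k, and the specific form of the sample-mean estimate (the average of one minus the largest estimated posterior) are assumptions made here. The density estimate uses the k-th nearest neighbor form (k - 1) / (n V), where V is the volume of the ball reaching the k-th neighbor, which is our reading of the Loftsgaarden and Quesenberry estimator. The error-counting estimate is returned alongside it for comparison.

```python
import math
import random


def unit_ball_volume(d):
    """Volume of the unit ball in d dimensions."""
    return math.pi ** (d / 2) / math.gamma(d / 2 + 1)


def knn_density(x, points, k):
    """k-NN density estimate at x: (k - 1) / (n * volume of ball to k-th neighbor).

    Assumes k >= 2, len(points) >= k, and no exact duplicates of x in points
    (a zero radius would cause a division by zero).
    """
    n, d = len(points), len(x)
    r_k = sorted(math.dist(x, p) for p in points)[k - 1]
    return (k - 1) / (n * unit_ball_volume(d) * r_k ** d)


def loo_risk_estimates(samples, labels, k):
    """Return (sample-mean risk estimate, error-counting estimate), 0-1 loss.

    Hypothetical helper: each sample is left out of the design set before
    the class densities and priors are estimated at that sample.
    """
    classes = sorted(set(labels))
    n = len(samples)
    risk_sum, errors = 0.0, 0
    for i, (x, y) in enumerate(zip(samples, labels)):
        scores = {}
        for c in classes:
            # Design points of class c, with the i-th sample held out.
            pts = [p for j, (p, lab) in enumerate(zip(samples, labels))
                   if lab == c and j != i]
            prior = len(pts) / (n - 1)
            scores[c] = prior * knn_density(x, pts, k)
        total = sum(scores.values())
        best = max(classes, key=lambda c: scores[c])
        risk_sum += 1.0 - scores[best] / total  # estimated conditional risk at x
        errors += best != y                     # error counting
    return risk_sum / n, errors / n


# Usage: two unit-variance Gaussian classes in one dimension (assumed test data).
random.seed(0)
data = [(random.gauss(0.0, 1.0),) for _ in range(200)]
data += [(random.gauss(2.0, 1.0),) for _ in range(200)]
y = [0] * 200 + [1] * 200
risk_hat, error_rate = loo_risk_estimates(data, y, k=10)
print(f"sample-mean estimate: {risk_hat:.3f}, error counting: {error_rate:.3f}")
```

For this assumed test problem the true Bayes risk is the standard normal tail probability at 1 (about 0.159), which gives a reference point for both printed estimates. The sketch favors clarity over speed: it recomputes all distances for every held-out sample, so its cost grows quadratically with the sample size.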