An empirical analysis of the probabilistic K-nearest neighbour classifier

The probabilistic nearest neighbour (PNN) method for pattern recognition was introduced to overcome a number of perceived shortcomings of nearest neighbour (NN) classifiers, namely the lack of any probabilistic semantics when making predictions of class membership. In addition, the NN method possesses no inherent, principled framework for inferring the number of neighbours, K, or any associated parameters of the chosen metric. Whilst the Bayesian inferential methodology underlying the PNN classifier undoubtedly overcomes these shortcomings, there has been to date no extensive systematic study of the performance of the PNN method, nor any comparison with the standard non-probabilistic approach. We address this issue by undertaking an extensive empirical study which highlights the essential characteristics of PNN when compared to a cross-validated K-NN classifier.