The training of neural classifiers with condensed datasets

In this paper we apply a k-nearest-neighbor-based data condensing algorithm to the training set of multilayer perceptron neural networks. By removing overlapping data and retaining only the training exemplars adjacent to the decision boundary, we are able to reduce network training time significantly while attaining a misclassification rate no worse than that of a network trained on the unedited training set. Results on a range of synthetic and real datasets indicate that a training speed-up of an order of magnitude is typical.
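To make the condensing step concrete, the sketch below shows one plausible k-nearest-neighbor-based reduction pipeline: a Wilson-style editing pass that removes points lying in class-overlap regions, followed by a Hart-style condensing pass that retains exemplars near the decision boundary. The specific passes, the choice of k, and the helper name `condense_training_set` are illustrative assumptions; the paper's exact algorithm and parameters are not reproduced here.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def condense_training_set(X, y, k=3):
    """Return a reduced (X, y) intended to retain only boundary-adjacent exemplars.

    Assumes y contains integer class labels. Illustrative sketch only.
    """
    # 1. Wilson-style editing: drop points whose k nearest neighbours (excluding
    #    the point itself) disagree with their label -- these sit in overlap regions.
    knn = KNeighborsClassifier(n_neighbors=k + 1).fit(X, y)
    neigh_idx = knn.kneighbors(X, return_distance=False)[:, 1:]  # drop self-match
    majority = np.array([np.bincount(y[idx]).argmax() for idx in neigh_idx])
    keep = majority == y
    X_e, y_e = X[keep], y[keep]

    # 2. Hart-style condensing: greedily add points that a 1-NN classifier built
    #    from the current subset misclassifies; such points tend to lie near the
    #    decision boundary, so the final subset approximates it.
    subset = [0]
    changed = True
    while changed:
        changed = False
        nn = KNeighborsClassifier(n_neighbors=1).fit(X_e[subset], y_e[subset])
        for i in range(len(X_e)):
            if i in subset:
                continue
            if nn.predict(X_e[i:i + 1])[0] != y_e[i]:
                subset.append(i)
                nn = KNeighborsClassifier(n_neighbors=1).fit(X_e[subset], y_e[subset])
                changed = True
    return X_e[subset], y_e[subset]
```

The reduced set returned by such a routine would then be used in place of the full training set when fitting the multilayer perceptron, which is the source of the reported speed-up.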