Improving performance of the k-nearest neighbor classifier by tolerant rough sets

We report on efforts to improve the performance of k-nearest neighbor classification by introducing the tolerant rough set. We relate the tolerant rough relation to object similarity: two objects are called similar if and only if they satisfy the requirements of the tolerant rough relation. The tolerant rough set is therefore used both to select objects from the training data and to construct the similarity function. A genetic algorithm (GA) is used to search for optimal similarity metrics. Experiments on artificial and real-world data show that our algorithm improves the performance of k-nearest neighbor classification and achieves higher accuracy than the C4.5 system.
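The following is a minimal sketch of how such a pipeline might fit together, not the method as implemented in the paper. It rests on several assumptions that the abstract does not specify: attributes are scaled to [0, 1]; similarity is a weighted average of per-attribute closeness; two objects are in the tolerance relation when that similarity reaches a threshold; and an object is kept when at least half of its tolerance class shares its label. The function names, weights, threshold, and toy data are all hypothetical. The weights and threshold are fixed by hand here, whereas the abstract states that a GA searches for the similarity metric.

```python
from collections import Counter

def similarity(x, y, weights):
    # Assumed form: weighted mean of per-attribute closeness, attributes in [0, 1].
    return sum(w * (1.0 - abs(a - b)) for w, a, b in zip(weights, x, y)) / sum(weights)

def tolerance_class(i, X, weights, threshold):
    # Indices of all objects similar to object i under the assumed relation.
    return [j for j, xj in enumerate(X) if similarity(X[i], xj, weights) >= threshold]

def select_objects(X, y, weights, threshold, min_membership=0.5):
    # Keep an object when enough of its tolerance class shares its label
    # (an assumed selection rule, chosen for illustration).
    kept = []
    for i in range(len(X)):
        cls = tolerance_class(i, X, weights, threshold)
        share = sum(1 for j in cls if y[j] == y[i]) / len(cls)
        if share >= min_membership:
            kept.append(i)
    return kept

def knn_predict(query, X, y, weights, k):
    # Rank training objects by similarity to the query and take a majority vote.
    ranked = sorted(range(len(X)), key=lambda i: similarity(query, X[i], weights), reverse=True)
    return Counter(y[i] for i in ranked[:k]).most_common(1)[0][0]

# Hypothetical toy data: object 2 is a mislabeled point inside the "a" cluster.
X = [[0.0, 0.0], [0.1, 0.0], [0.05, 0.0], [0.9, 1.0], [1.0, 1.0]]
y = ["a", "a", "b", "b", "b"]
weights, threshold = [1.0, 1.0], 0.9  # hand-picked; a GA would tune these

kept = select_objects(X, y, weights, threshold)
X_sel = [X[i] for i in kept]
y_sel = [y[i] for i in kept]

query = [0.05, 0.05]
pred_full = knn_predict(query, X, y, weights, k=1)
pred_selected = knn_predict(query, X_sel, y_sel, weights, k=1)
print(kept, pred_full, pred_selected)
```

In this toy setting the selection step drops the mislabeled object, so the 1-nearest-neighbor prediction for the query changes once the training set has been filtered. A GA would wrap this procedure, scoring candidate weight vectors and thresholds by validation accuracy.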