Improving mining of medical data by outliers prediction

This paper presents a new outlier prediction method intended to improve classification performance when mining medical data. The method introduces a class confusion score metric based on the classification results of a set of classifiers induced by an evolutionary decision tree induction algorithm. Classification performance is expected to improve once the identified outliers are removed from the training set. Our hypothesis is that a classifier trained on the filtered dataset captures a better, more general knowledge model and should therefore also perform better on unseen cases. The proposed method is applied to two cardiovascular datasets, and the obtained results are discussed.
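
The idea of scoring training instances by how often a set of classifiers confuses their class, and then filtering the training set, can be illustrated with a minimal sketch. The abstract does not define the class confusion score, so the definition used here (the fraction of classifiers that misclassify an instance) is an assumption, as are the function names, the threshold of 0.5, and the toy data. The classifiers are represented only by precomputed prediction lists; they are not the evolutionary decision trees used in the paper.

```python
from typing import List, Sequence


def class_confusion_scores(
    true_labels: Sequence[int],
    predictions_per_classifier: Sequence[Sequence[int]],
) -> List[float]:
    """Return, per instance, the fraction of classifiers that misclassify it.

    Assumed definition for illustration; the paper's exact metric may differ.
    """
    n_classifiers = len(predictions_per_classifier)
    if n_classifiers == 0:
        raise ValueError("at least one classifier is required")
    scores = []
    for i, label in enumerate(true_labels):
        # Count the classifiers whose prediction disagrees with the true label.
        wrong = sum(1 for preds in predictions_per_classifier if preds[i] != label)
        scores.append(wrong / n_classifiers)
    return scores


def filter_outliers(
    instances: Sequence[str],
    scores: Sequence[float],
    threshold: float = 0.5,  # hypothetical cut-off, not taken from the paper
) -> List[str]:
    """Keep only the instances whose score does not exceed the threshold."""
    return [inst for inst, score in zip(instances, scores) if score <= threshold]


# Toy data: four training instances and the predictions of three classifiers.
instances = ["a", "b", "c", "d"]
true_labels = [0, 1, 1, 0]
predictions = [
    [0, 1, 0, 0],
    [0, 1, 0, 1],
    [0, 0, 0, 0],
]

scores = class_confusion_scores(true_labels, predictions)
filtered = filter_outliers(instances, scores)
print(scores)
print(filtered)
```

In this toy example, instance "c" is misclassified by every classifier and is therefore the only one removed; a final classifier would then be trained on the remaining instances.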