Support Vector Machines for Knowledge Discovery

In this paper, we apply support vector machines (SVMs) to knowledge discovery (KD) and confirm their effectiveness on a benchmark data set. SVMs have been successfully applied to problems in various domains. However, their effectiveness as a KD method is unknown. We propose SVM for KD, which addresses binary classification problems and rescales each attribute using z-scores. SVM for KD can rank attributes by their effectiveness in discriminating between the classes. Moreover, SVM for KD can discover the examples that are crucial for discrimination. We set up six discovery tasks with the meningoencephalitis data set, which is a benchmark data set in KD. A domain expert rated the discovery outcomes of SVM for KD on a scale from one to five with respect to several criteria. The attributes selected in the six tasks were all judged valid and useful, with average scores of 3.8 to 4.0. Discovering the order of attributes by usefulness is a challenging problem. Nevertheless, on this problem our method achieved a score of 4.0 or higher in three tasks. In addition, the crucial examples for discrimination and the typical examples for each class agree with medical knowledge. These promising results demonstrate the effectiveness of our approach.
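The pipeline summarized above (z-score rescaling, a binary SVM, attribute ranking, and identification of crucial and typical examples) can be illustrated with a minimal sketch. Several points here are assumptions rather than details taken from the paper: the SVM is linear and trained by plain batch subgradient descent on the regularized hinge loss, attributes are ranked by the absolute value of their learned weights, crucial examples are taken to be those with the smallest margins, and typical examples those with the largest. The toy data and all function names are hypothetical and unrelated to the meningoencephalitis data set.

```python
import statistics


def zscore_columns(rows):
    """Rescale each attribute (column) to mean 0 and standard deviation 1."""
    cols = list(zip(*rows))
    means = [statistics.fmean(c) for c in cols]
    # Fall back to 1.0 for constant columns to avoid division by zero.
    stds = [statistics.pstdev(c) or 1.0 for c in cols]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def train_linear_svm(X, y, lam=0.01, epochs=200, lr=0.1):
    """Batch subgradient descent on the L2-regularized hinge loss.

    This is an illustrative stand-in for a proper SVM solver.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [lam * wj for wj in w], 0.0
        for xi, yi in zip(X, y):
            if yi * (dot(w, xi) + b) < 1:  # example violates the margin
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


# Hypothetical toy data: attribute 0 separates the classes, attribute 1 barely does.
rows = [
    [1.0, 5.0], [1.5, 3.0], [2.0, 4.0], [2.5, 6.0],  # class -1
    [4.0, 5.5], [4.5, 3.5], [5.0, 4.5], [5.5, 5.0],  # class +1
]
y = [-1, -1, -1, -1, 1, 1, 1, 1]

Z = zscore_columns(rows)
w, b = train_linear_svm(Z, y)

# Assumed ranking criterion: larger |weight| means a more discriminative attribute.
ranking = sorted(range(len(w)), key=lambda j: -abs(w[j]))

# Assumed notion of crucial/typical: smallest and largest margins, respectively.
margins = [yi * (dot(w, xi) + b) for xi, yi in zip(Z, y)]
by_margin = sorted(range(len(Z)), key=margins.__getitem__)
crucial, typical = by_margin[:2], by_margin[-2:]

print("attribute ranking:", ranking)
print("crucial examples:", crucial)
print("typical examples:", typical)
```

Because every attribute is rescaled to unit variance before training, the weight magnitudes are comparable across attributes, which is what makes a weight-based ranking meaningful in this sketch.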