A selective Bayes Classifier for classifying incomplete data based on gain ratio

Actual data sets are often incomplete because of various kinds of reasons. Although numerous algorithms about classification have been proposed, most of them deal with complete data. So methods of constructing classifiers for incomplete data deserve more attention. By analyzing main methods of processing incomplete data for classification, this paper presents a selective Bayes Classifier for classifying incomplete data with a simpler formula for computing gain ratio. The proposed algorithm needs no assumption about data sets that are necessary for previous methods of processing incomplete data in classification. Experiments on 12 benchmark incomplete data sets show that this method can greatly improve the accuracy of classification. Furthermore, it can sharply reduce the number of attributes and so can greatly simplify the data sets and classifiers.