Acquisition of optimal attribute subset through genetic algorithm using GNP-based class association rule mining

Attribute selection is a technique for pruning less relevant information so that high-quality knowledge can be discovered. It is especially useful when classifying a large database, because preprocessing the data increases the likelihood that the predictor attributes given to the mining algorithm are relevant to the class attribute. This paper proposes a method for acquiring the optimal attribute subset for genetic network programming (GNP)-based class association rule mining; the attribute selection is carried out by a genetic algorithm (GA) and leads to higher classification accuracy. Class association rule mining through GNP is conducted with a small subset of attributes rather than the original large number of attributes; thus simple but important rules are obtained for classification, while the local optimum problem is avoided. Simulation results with educational data show that classification accuracy improves from 52.73% to 74.54%, a gain of 21.81 percentage points, when classification is made using the optimal attribute subset. © 2014 Institute of Electrical Engineers of Japan. Published by John Wiley & Sons, Inc.
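
The GA-driven attribute selection summarized in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation, and several parts are assumptions made only for illustration: the GNP-based class association rule miner is replaced by a simple majority-vote lookup classifier, the data are synthetic (the class depends only on attributes 0 and 1), and the function names (`make_toy_data`, `accuracy`, `fitness`, `select_attributes`), the per-attribute penalty, and all GA parameters are hypothetical choices. Each individual is a bit mask over the attributes, and its fitness is the hold-out accuracy obtained with the selected subset.

```python
import random
from collections import Counter, defaultdict

def make_toy_data(rng, n_rows=300, n_attrs=8):
    # Synthetic data (assumption for illustration): the class depends only
    # on attributes 0 and 1; the remaining attributes are noise.
    data = []
    for _ in range(n_rows):
        x = tuple(rng.randint(0, 1) for _ in range(n_attrs))
        data.append((x, x[0] & x[1]))
    return data

def accuracy(mask, train, valid):
    # Stand-in classifier (NOT GNP rule mining): predict the majority class
    # seen in training for each combination of the selected attribute values.
    idx = [i for i, bit in enumerate(mask) if bit]
    default = Counter(y for _, y in train).most_common(1)[0][0]
    votes = defaultdict(Counter)
    for x, y in train:
        votes[tuple(x[i] for i in idx)][y] += 1
    hits = 0
    for x, y in valid:
        key = tuple(x[i] for i in idx)
        pred = votes[key].most_common(1)[0][0] if key in votes else default
        hits += pred == y
    return hits / len(valid)

def fitness(mask, train, valid, penalty=0.01):
    # Hold-out accuracy minus a small per-attribute penalty (hypothetical
    # choice) so that smaller subsets win ties.
    return accuracy(mask, train, valid) - penalty * sum(mask)

def select_attributes(train, valid, n_attrs, rng,
                      pop_size=20, generations=30, p_mut=0.1):
    def score(mask):
        return fitness(mask, train, valid)

    # Population of bit masks; the all-attributes mask is kept as a baseline.
    pop = [[1] * n_attrs]
    pop += [[rng.randint(0, 1) for _ in range(n_attrs)]
            for _ in range(pop_size - 1)]
    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        nxt = pop[:2]  # elitism: carry the two best masks over unchanged
        while len(nxt) < pop_size:
            # Tournament selection of two parents (tournament size 3).
            a, b = (max(rng.sample(pop, 3), key=score) for _ in range(2))
            cut = rng.randrange(1, n_attrs)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Bit-flip mutation with probability p_mut per attribute.
            child = [bit ^ (rng.random() < p_mut) for bit in child]
            nxt.append(child)
        pop = nxt
    return max(pop, key=score)

rng = random.Random(0)
data = make_toy_data(rng)
train, valid = data[:200], data[200:]
best = select_attributes(train, valid, 8, rng)
print("selected attributes:", [i for i, bit in enumerate(best) if bit])
print("accuracy, all attributes:", accuracy([1] * 8, train, valid))
print("accuracy, selected subset:", accuracy(best, train, valid))
```

Because the two best masks are carried over unchanged in every generation and the all-attributes mask is part of the initial population, the returned subset never scores below the all-attributes baseline under this fitness function. In a real study the fitness would be computed on a validation split kept separate from the final test data.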