Evolutionary Feature Construction Using Information Gain and Gini Index

Genetic programming is used to construct features, and the effect of including the evolved attributes on the performance of a range of classification algorithms is studied. Two fitness functions are used in the genetic program, one based on information gain and the other on the Gini index. The classification algorithms are three decision tree algorithms (C5, CART, and CHAID) and an MLP neural network. The aim of the research is to ascertain whether a decision tree algorithm benefits more from features constructed by a genetic program whose fitness function incorporates the same fundamental learning mechanism as the tree's own splitting criterion.
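
The two fitness measures named above can be illustrated with a minimal sketch. The abstract does not specify how the fitness functions are computed, so everything here is an assumption made for illustration: the evolved attribute is numeric, it is scored by the best single binary threshold split, and the helper names (`split_score`, `fitness`) and the toy data are hypothetical.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy of a list of class labels, in bits."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def split_score(values, labels, threshold, impurity):
    """Impurity reduction from splitting on values <= threshold."""
    left = [y for v, y in zip(values, labels) if v <= threshold]
    right = [y for v, y in zip(values, labels) if v > threshold]
    if not left or not right:
        return 0.0
    n = len(labels)
    weighted = (len(left) / n) * impurity(left) + (len(right) / n) * impurity(right)
    return impurity(labels) - weighted

def fitness(values, labels, impurity):
    """Assumed fitness: best impurity reduction over all candidate thresholds."""
    candidates = sorted(set(values))[:-1]  # the largest value cannot split the data
    return max((split_score(values, labels, t, impurity) for t in candidates),
               default=0.0)

# Toy data and a hypothetical evolved attribute (x1 * x2), for illustration only.
rows = [(1.0, 2.0), (2.0, 0.5), (3.0, 3.0), (0.5, 1.0)]
labels = ["a", "a", "b", "a"]
constructed = [x1 * x2 for x1, x2 in rows]

ig = fitness(constructed, labels, entropy)  # information-gain-based fitness
gi = fitness(constructed, labels, gini)     # Gini-based fitness
print(f"information gain: {ig:.4f}, Gini reduction: {gi:.4f}")
```

With `entropy` as the impurity, the score is the information gain of the split, the quantity underlying C4.5-style splitting; with `gini`, it is the impurity reduction used by CART. Passing the impurity as a parameter keeps the rest of the fitness evaluation identical, so the two measures can be compared on the same evolved attribute.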
