Feature Construction and Dimension Reduction Using Genetic Programming

This paper describes a new approach to using genetic programming (GP) for feature construction in classification problems. Unlike most existing methods, which wrap a particular classifier to construct a single feature, this approach uses GP to construct multiple high-level features from the original features. The constructed features are then used by decision trees for classification. Because feature construction is independent of the classifier, the fitness function is based on class dispersion and entropy. The approach is evaluated on 12 benchmark classification problems and compared with the standard decision tree method using the original features, and with decision trees using a combination of the original and constructed features. The results show that the new approach outperforms the standard use of decision trees on these problems in terms of classification performance, dimension reduction, and learned decision tree size.
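To make the idea of a classifier-independent fitness concrete, the following is a minimal sketch, not the paper's actual fitness function. It assumes that "class dispersion" is summarized by an interval of mean plus or minus three standard deviations of the constructed feature's values for one class, and that "entropy" is the Shannon entropy of the class labels falling inside that interval (lower is better). The names `class_interval`, `interval_entropy`, `constructed`, and the `width` parameter are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter
from statistics import mean, pstdev


def class_interval(values, labels, target, width=3.0):
    """Interval mean +/- width * std of the feature values of class `target`.

    Assumption: this is one simple way to summarize class dispersion.
    """
    own = [v for v, y in zip(values, labels) if y == target]
    mu, sigma = mean(own), pstdev(own)
    return mu - width * sigma, mu + width * sigma


def interval_entropy(values, labels, target, width=3.0):
    """Shannon entropy (in bits) of the labels inside the target class interval.

    0.0 means only one class falls inside the interval (a pure interval);
    larger values mean the constructed feature mixes classes there.
    """
    lo, hi = class_interval(values, labels, target, width)
    inside = [y for v, y in zip(values, labels) if lo <= v <= hi]
    n = len(inside)
    counts = Counter(inside)
    return sum((c / n) * math.log2(n / c) for c in counts.values())


def constructed(x):
    """Hypothetical GP-evolved feature: the product of two original features."""
    return x[0] * x[1]


# Toy data: two original features per instance, two classes.
X = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (3.0, 3.0), (3.1, 2.9), (2.9, 3.2)]
y = ["a", "a", "a", "b", "b", "b"]

values = [constructed(x) for x in X]
fitness = interval_entropy(values, y, "a")
print(fitness)
```

In a GP run, a score of this kind could be computed for each candidate expression without training any classifier, which is what makes the feature construction step independent of the decision tree that later consumes the features.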
