Feature Construction during Tree Learning

Genetic algorithms (GAs) are well suited to learning concepts that span complex spaces, especially those with a large number of local optima. Learning algorithms in general perform well on data that has been pre-processed to reduce complexity, and several studies have documented their effectiveness on raw data as well as on data pre-processed with techniques such as feature selection. Unlike other learning algorithms (e.g., those used in feedforward neural networks), GAs are not particularly effective at reducing data complexity while learning difficult concepts. Feature construction has been shown to reduce the complexity of the space spanned by the input data. In this paper, we present an algorithm that enhances the learning process of a GA by using feature construction as a pre-processing step. We also apply the same procedure to two other learning methods, namely C4.5 and Lazy Learner, and show improvement in performance.
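To make the idea of feature construction as a pre-processing step concrete, the following is a minimal sketch and not the algorithm presented in the paper, whose construction operators are not described in this abstract. It assumes Boolean attributes and a hypothetical operator that appends the XOR of every attribute pair. The toy concept and the helper names (`construct_pairwise_features`, `best_single_attribute_accuracy`) are illustrative assumptions.

```python
from itertools import combinations, product


def construct_pairwise_features(rows):
    """Append the XOR of every attribute pair to each Boolean row.

    Hypothetical construction operator chosen for illustration only.
    """
    n = len(rows[0])
    pairs = list(combinations(range(n), 2))
    return [tuple(row) + tuple(row[i] ^ row[j] for i, j in pairs) for row in rows]


def best_single_attribute_accuracy(rows, labels):
    """Accuracy of the best rule that predicts from one attribute or its negation."""
    best = 0.0
    for k in range(len(rows[0])):
        agree = sum(1 for row, y in zip(rows, labels) if row[k] == y)
        best = max(best, max(agree, len(rows) - agree) / len(rows))
    return best


# Toy concept (assumed for this sketch): label = x0 XOR x1, with x2 irrelevant.
raw = [tuple(bits) for bits in product([0, 1], repeat=3)]
labels = [x0 ^ x1 for x0, x1, _ in raw]

raw_acc = best_single_attribute_accuracy(raw, labels)
constructed_acc = best_single_attribute_accuracy(
    construct_pairwise_features(raw), labels
)
print(raw_acc, constructed_acc)
```

On the raw attributes no single-attribute rule does better than chance on this concept, whereas one of the constructed attributes coincides with the target. A downstream learner, whether a GA, a decision tree, or an instance-based method, then faces a much simpler search space.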
