Feature Generation Using General Constructor Functions

Most classification algorithms receive as input a set of attributes describing the objects to be classified. In many cases, however, the supplied attributes are not sufficient for creating an accurate, succinct, and comprehensible representation of the target concept. To overcome this problem, researchers have proposed algorithms for the automatic construction of features. Most of these algorithms build new features using a limited, predefined set of operators. In this paper we propose a generalized and flexible framework that can generate features from any given set of constructor functions. These can be domain-independent functions, such as arithmetic and logic operators, or domain-dependent operators that draw on partial knowledge supplied by the user. The paper describes an algorithm that receives as input a set of classified objects, a set of attributes, and a specification of a set of constructor functions, including their domains, ranges, and properties. The algorithm outputs a set of generated features that standard concept learners can use to create improved classifiers. The algorithm maintains a set of its best generated features and improves this set iteratively: during each iteration it performs a beam search over the defined feature space, constructing new features by applying constructor functions to the members of its current feature set. The search is guided by general heuristic measures that are not confined to a specific feature representation. The algorithm was applied to a variety of classification problems and generated features that were strongly related to the underlying target concepts; these features also significantly improved the accuracy achieved by standard concept learners.
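The iterative scheme described above can be sketched in a few dozen lines of Python. The sketch below is illustrative, not the paper's implementation: the constructor functions, the sign-based discretization, and names such as `generate_features` and `info_gain` are assumptions chosen to make the idea concrete. It declares a small set of binary numeric constructor functions, applies them to all pairs of features in the current beam, scores each candidate with information gain (one possible representation-independent heuristic), and keeps the top-scoring candidates for the next iteration.

```python
import math
from itertools import combinations

def entropy(labels):
    """Shannon entropy of a sequence of class labels."""
    n = len(labels)
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def info_gain(values, labels):
    """Information gain of a discrete feature with respect to the labels."""
    base = entropy(labels)
    n = len(labels)
    by_value = {}
    for v, y in zip(values, labels):
        by_value.setdefault(v, []).append(y)
    remainder = sum(len(ys) / n * entropy(ys) for ys in by_value.values())
    return base - remainder

# Constructor functions: here, three domain-independent arithmetic
# operators over numeric features (a stand-in for the paper's general
# specification of domains, ranges, and properties).
CONSTRUCTORS = [
    ("add", lambda a, b: a + b),
    ("mul", lambda a, b: a * b),
    ("sub", lambda a, b: a - b),
]

def generate_features(features, labels, beam_width=3, iterations=2):
    """Beam search over the feature space defined by CONSTRUCTORS.

    features: dict mapping a feature name to its list of values,
              one value per classified example.
    Returns the original features plus the best generated ones.
    """
    beam = dict(features)
    for _ in range(iterations):
        # Build candidates by applying each constructor to each
        # pair of features currently in the beam.
        candidates = {}
        for (n1, v1), (n2, v2) in combinations(beam.items(), 2):
            for op_name, op in CONSTRUCTORS:
                name = f"{op_name}({n1},{n2})"
                candidates[name] = [op(a, b) for a, b in zip(v1, v2)]
        # Score candidates; discretizing by sign is a crude choice
        # made only to keep this sketch self-contained.
        scored = sorted(
            candidates.items(),
            key=lambda kv: info_gain([v > 0 for v in kv[1]], labels),
            reverse=True,
        )
        for name, vals in scored[:beam_width]:
            beam[name] = vals
    return beam
```

On an XOR-like concept (the class is positive exactly when two numeric attributes share a sign), neither original attribute is informative on its own, but the generated feature `mul(x,y)` separates the classes perfectly, which is the kind of strongly related feature the paper aims to construct.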
