Feature Construction Methods: A Survey

A good feature representation is central to achieving high performance in any machine learning task. However, manually defining a good feature set is often not feasible. Feature construction involves transforming a given set of input features to generate a new set of more powerful features, which can then be used for prediction. Several feature construction methods have been developed. In this paper, we present a survey of the past 20 years of research in the area. We describe the major issues involved and discuss the manner in which various methods deal with them. While our understanding of feature construction has grown significantly over the years, a number of open challenges remain.
