A Constructive Meta-Level Feature Selection Method based on Method Repositories

Feature selection is a key issue in the data pre-processing stage of the classification task in a data mining process. Although many efforts have been made to improve typical feature selection algorithms (FSAs), such as filter methods and wrapper methods, it is hard for any single FSA to maintain its performance across various datasets. To address this problem, we propose another way to support the feature selection procedure: constructing a proper FSA for each given dataset. We discuss constructive meta-level feature selection, which decomposes representative FSAs into methods and then re-constructs a proper FSA from a method repository for each given dataset. After implementing the constructive meta-level feature selection system, we show that constructive meta-level feature selection works well on 34 UCI common datasets, comparing its accuracy with that of typical FSAs. As the result, our system achieves high accuracy with lower computational cost, constructing a proper FSA for each given dataset automatically.
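The core idea of the abstract can be sketched as follows. This is a minimal illustrative toy, not the authors' implementation: a "method repository" holds decomposed FSA components (here, two hypothetical relevance measures and two search strategies), and for each dataset the system enumerates combinations and keeps the one scoring best under a simple evaluator (nearest-class-mean accuracy, an assumption made for brevity).

```python
# Hypothetical sketch of constructive meta-level feature selection:
# decompose FSAs into component methods, store them in a repository,
# and re-assemble the best-scoring combination per dataset.
from itertools import product

# --- component methods (the "method repository"; names are illustrative) ---
def variance_score(col, labels):
    m = sum(col) / len(col)
    return sum((v - m) ** 2 for v in col) / len(col)

def correlation_score(col, labels):
    n = len(col)
    mx, my = sum(col) / n, sum(labels) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(col, labels))
    vx = sum((x - mx) ** 2 for x in col)
    vy = sum((y - my) ** 2 for y in labels)
    return abs(cov) / ((vx * vy) ** 0.5) if vx and vy else 0.0

REPOSITORY = {
    "measure": {"variance": variance_score, "correlation": correlation_score},
    "search":  {"top1": lambda ranked: ranked[:1],
                "top2": lambda ranked: ranked[:2]},
}

# --- toy evaluator: nearest class mean on the selected features ------------
def class_means(X, y, feats):
    means = {}
    for c in sorted(set(y)):
        rows = [row for row, label in zip(X, y) if label == c]
        means[c] = [sum(r[f] for r in rows) / len(rows) for f in feats]
    return means

def accuracy(X, y, feats):
    means = class_means(X, y, feats)
    correct = 0
    for row, label in zip(X, y):
        pred = min(means, key=lambda c: sum((row[f] - m) ** 2
                                            for f, m in zip(feats, means[c])))
        correct += pred == label
    return correct / len(X)

# --- construct an FSA for this dataset by trying method combinations -------
def construct_fsa(X, y):
    best = None
    n_feats = len(X[0])
    for (mname, measure), (sname, search) in product(
            REPOSITORY["measure"].items(), REPOSITORY["search"].items()):
        ranked = sorted(range(n_feats),
                        key=lambda f: -measure([row[f] for row in X], y))
        feats = search(ranked)
        acc = accuracy(X, y, feats)
        if best is None or acc > best[0]:
            best = (acc, mname, sname, feats)
    return best

# Toy dataset: feature 0 is informative, feature 1 is noise.
X = [[0, 5], [0, 7], [1, 6], [1, 4]]
y = [0, 0, 1, 1]
print(construct_fsa(X, y))  # → (1.0, 'correlation', 'top1', [0])
```

A real system would replace the toy evaluator with cross-validated accuracy of the downstream learner and a much larger repository of decomposed methods; the point of the sketch is only that the FSA is assembled per dataset rather than fixed in advance.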
