Review and Evaluation of Feature Selection Algorithms in Synthetic Problems

Abstract: The main purpose of feature subset selection is to find a reduced subset of attributes from a data set described by a feature set. The task of a feature selection algorithm (FSA) is to provide a computational solution motivated by a certain definition of relevance or by a reliable evaluation measure. In this paper, several fundamental algorithms are studied to assess their performance in a controlled experimental scenario. A measure to evaluate FSAs is devised that computes the degree of matching between the output given by an FSA and the known optimal solutions. An extensive experimental study on synthetic problems is carried out to assess the behaviour of the algorithms in terms of solution accuracy and size as a function of the relevance, irrelevance, redundancy and size of the data samples. The controlled experimental conditions facilitate the derivation of better-supported and meaningful conclusions.

Keywords: Feature Selection Algorithms; Empirical Evaluations; Attribute Relevance and Redundancy.
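The abstract describes a measure that scores an FSA's output against the known optimal feature subset of a synthetic problem. The paper's exact measure is not given in this excerpt, so the sketch below illustrates the general idea with a simple stand-in: the fraction of include/exclude decisions on which the algorithm's output agrees with the known-relevant set (a Hamming-style agreement score; the function name and scoring rule are assumptions for illustration only).

```python
def subset_match_score(selected, relevant, total_features):
    """Illustrative degree-of-matching score between an FSA's output and
    a known optimal subset on a synthetic problem.

    NOTE: this is a hypothetical stand-in for the paper's evaluation
    measure, not its actual definition. It counts, over all features,
    how often the FSA's include/exclude decision agrees with the known
    relevant set, normalised to [0, 1].
    """
    selected = set(selected)
    relevant = set(relevant)
    correct = sum(
        1 for f in range(total_features)
        if (f in selected) == (f in relevant)
    )
    return correct / total_features
```

Under this toy score, a perfect recovery of the relevant subset yields 1.0, while each missed relevant feature or selected irrelevant feature lowers the score by `1 / total_features`.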
