A Comparative Evaluation of Sequential Feature Selection Algorithms

Several recent machine learning publications demonstrate the utility of feature selection algorithms in supervised learning tasks. Among these, sequential feature selection algorithms are receiving attention, and the most frequently studied variants are forward and backward sequential selection. Many studies of supervised learning with sequential feature selection report applications of these two algorithms but do not consider other variants that might be more appropriate for some performance tasks. This paper reports positive empirical results on such variants and argues for their serious consideration in similar learning tasks.
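For readers unfamiliar with the two baseline algorithms named above, the following is a minimal sketch of greedy forward and backward sequential selection. It is an illustration under stated assumptions, not the implementation evaluated in the paper: the function names, the stopping rules, and the toy scoring function `toy_score` are all hypothetical. In practice the score would typically be an estimate of classifier accuracy on the chosen feature subset.

```python
from typing import Callable, FrozenSet

# A scoring function maps a feature subset to a number (higher is better).
# Assumption: in a real study this would be something like cross-validated
# accuracy of a classifier restricted to the subset.
Score = Callable[[FrozenSet[int]], float]


def forward_selection(n_features: int, score: Score) -> FrozenSet[int]:
    """Start from the empty set; repeatedly add the single best feature."""
    selected: FrozenSet[int] = frozenset()
    best = score(selected)
    while len(selected) < n_features:
        candidates = [selected | {f} for f in range(n_features) if f not in selected]
        top = max(candidates, key=score)
        top_score = score(top)
        if top_score <= best:  # stop when no addition strictly improves the score
            break
        selected, best = top, top_score
    return selected


def backward_selection(n_features: int, score: Score) -> FrozenSet[int]:
    """Start from the full set; repeatedly remove the least useful feature."""
    selected: FrozenSet[int] = frozenset(range(n_features))
    best = score(selected)
    while selected:
        candidates = [selected - {f} for f in selected]
        top = max(candidates, key=score)
        top_score = score(top)
        if top_score < best:  # stop when every removal lowers the score
            break
        selected, best = top, top_score  # ties favor the smaller subset
    return selected


def toy_score(subset: FrozenSet[int]) -> float:
    """Hypothetical score: features 0 and 2 are relevant, each extra feature costs 0.1."""
    relevant = frozenset({0, 2})
    return len(subset & relevant) - 0.1 * len(subset)


if __name__ == "__main__":
    print(sorted(forward_selection(4, toy_score)))
    print(sorted(backward_selection(4, toy_score)))
```

Both procedures are greedy hill climbers over feature subsets and differ only in their starting point and move direction, which is why the two can reach different subsets when features interact.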
