A stability index for feature selection

Sequential forward selection (SFS) is one of the most widely used feature selection procedures. It starts with an empty set and adds one feature at each step. The estimate of the quality of the candidate subsets usually depends on the training/testing split of the data, so repeated runs of SFS may return different sequences of features. A substantial discrepancy between such sequences signals a problem with the selection. A stability index is proposed here based on the cardinality of the intersection and a correction for chance. The experimental results with 10 real data sets indicate that the index can be useful for selecting the final feature subset. If stability is high, then we should return a subset of features based on their total rank across the SFS runs. If stability is low, then it is better to return the feature subset that gave the minimum error across all SFS runs.
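The abstract does not spell out the index, so the following is only a minimal sketch of one chance-corrected form consistent with its description: the overlap between two equal-sized subsets is measured by the cardinality of their intersection, rescaled so that identical subsets score 1 and chance-level overlap scores about 0, and stability over repeated SFS runs is taken as the average over all pairs. The function names and the averaging step are illustrative assumptions, not the paper's exact formulation.

```python
from itertools import combinations

def consistency_index(A, B, n):
    """Chance-corrected similarity between two feature subsets of equal size k
    drawn from n features: (r*n - k*k) / (k*(n - k)), where r = |A & B|.
    Assumed form; returns 1 for identical subsets, ~0 for chance-level overlap."""
    A, B = set(A), set(B)
    k = len(A)
    assert len(B) == k and 0 < k < n
    r = len(A & B)
    return (r * n - k * k) / (k * (n - k))

def stability(subsets, n):
    """Average pairwise consistency over the subsets returned by repeated SFS runs
    (an assumed aggregation, used here only to illustrate the idea)."""
    pairs = list(combinations(subsets, 2))
    return sum(consistency_index(a, b, n) for a, b in pairs) / len(pairs)

# Hypothetical example: three SFS runs on a 10-feature problem, each selecting 4 features.
runs = [{0, 1, 2, 3}, {0, 1, 2, 5}, {0, 1, 3, 7}]
print(stability(runs, n=10))
# High stability -> aggregate features by total rank across runs;
# low stability -> return the subset with the minimum estimated error.
```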
