Comparison of feature ranking methods based on information entropy

Five entropy-based feature ranking methods are compared on artificial and real datasets. A feature ranking method using χ² statistics gives results very similar to those of the entropy-based methods. The quality of the feature rankings obtained by these methods is evaluated using a decision tree and a nearest neighbor classifier, with a growing number of the most important features. Significant differences are found in some cases, but no single index works best for all data and all classifiers. Therefore, being sure that the subset of features giving the highest accuracy has been selected requires the use of many different indices.
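As an illustration of what an entropy-based ranking index looks like, the following minimal sketch ranks discrete features by information gain (the reduction in class entropy after conditioning on a feature). The abstract does not name the five indices compared, so information gain is used here only as a familiar example; the function names and the toy data are hypothetical and not taken from the paper.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(feature, labels):
    """Class entropy minus the class entropy conditioned on the feature."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_features(columns, labels):
    """Return (name, score) pairs sorted from most to least informative."""
    scores = {name: information_gain(col, labels) for name, col in columns.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy data (hypothetical): one feature determines the class, one is unrelated.
labels = [0, 0, 1, 1]
columns = {
    "informative": [0, 0, 1, 1],
    "noise": [0, 1, 0, 1],
}
ranking = rank_features(columns, labels)
print(ranking)
```

Following the evaluation scheme described above, one would then train a classifier on the top k features of such a ranking for growing k and compare the resulting accuracies across indices.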
