Evaluation of mixed-valued features via set cover criteria

Traditional feature evaluation methods, such as information gain, entropy, and mutual information, generally evaluate the discriminating power of each feature independently according to one of a wide variety of metrics; we refer to this as the TopK approach. Although a few feature evaluation methods, such as wrappers and criterion functions, evaluate the discriminating power of a subset of features instead, they are usually either based on a heuristic scheme or burdened by high computational cost. As a result, when applied to multi-class classification on large data sets, existing feature evaluation methods either fall into the "siren pitfall" (a surplus of discriminating features for some classes alongside a lack of discriminating features for the remaining classes) or become inapplicable because of repeatability problems and computational cost. Specifically, when applied to multi-class classification, the TopK approach overweights individually discriminating features while ignoring their collective discrimination, and the optimal feature subsets discovered by wrapper methods depend on the underlying classifier and are not repeatable, in addition to incurring a rather high computational cost. In this paper, we propose an effective feature evaluation method for mixed-valued data sets based on set cover criteria. Our set cover feature evaluation method has several advantages in addressing the "siren pitfall" problem: its feature selection scheme is robust and relies on little prior knowledge, its feature evaluation process is repeatable, and its computational cost is low. In addition, the set cover method is applicable to mixed-valued data sets and can weigh the discriminating power of features quantitatively. Experimental results demonstrate the effectiveness of our set cover method.
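To illustrate the general set cover idea behind such a method (a minimal sketch of our own, not the paper's actual algorithm): treat each feature as "covering" the between-class example pairs it can distinguish, then greedily select features until every pair is covered. The function name, the pairwise-coverage criterion, and the greedy strategy below are illustrative assumptions; inequality of values works for both categorical and numeric features, which is what makes the scheme applicable to mixed-valued data.

```python
from itertools import combinations

def greedy_set_cover_features(X, y, k=None):
    """Illustrative greedy set-cover feature selection (not the paper's method).

    Each feature f covers the pairs of examples from different classes
    whose values differ on f; features are chosen greedily by how many
    still-uncovered pairs they cover.
    """
    n = len(X)
    # Universe: all pairs of examples drawn from different classes.
    universe = {(i, j) for i, j in combinations(range(n), 2) if y[i] != y[j]}
    n_features = len(X[0])
    # Pairs distinguished by each feature (mixed types compare by inequality).
    covers = {
        f: {(i, j) for (i, j) in universe if X[i][f] != X[j][f]}
        for f in range(n_features)
    }
    selected, uncovered = [], set(universe)
    while uncovered and (k is None or len(selected) < k):
        best = max(covers, key=lambda f: len(covers[f] & uncovered))
        gain = covers[best] & uncovered
        if not gain:
            break  # remaining pairs cannot be distinguished by any feature
        selected.append(best)
        uncovered -= gain
    return selected
```

On a toy mixed-valued data set with one class-aligned numeric feature and one uninformative categorical feature, e.g. `greedy_set_cover_features([[0, 'a'], [0, 'b'], [1, 'a'], [1, 'b']], [0, 0, 1, 1])`, the greedy pass selects only feature 0, since it alone covers all between-class pairs.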
