SIP-FS: a novel feature selection for data representation

Multiple features are widely used to characterize real-world datasets. For a comprehensive data description, it is desirable to select leading features that are stable and interpretable from a set of distinct features. However, most existing feature selection methods focus on the predictability (e.g., prediction accuracy) of the selected results and neglect stability. To obtain a compact data representation, we propose a novel feature selection method that improves stability and interpretability without sacrificing predictability (SIP-FS). Instead of mutual information, generalized correlation is adopted in the minimal redundancy maximal relevance criterion to measure the relation between different feature types. Several feature types (each containing a certain number of features) can then be selected and evaluated quantitatively to determine which types contribute to a specific class, thereby enhancing the interpretability of the features. Moreover, stability is introduced into the SIP-FS criterion to obtain consistent ranking results. We conduct experiments on three publicly available datasets, using a one-versus-all strategy to select class-specific features. The experiments show that SIP-FS achieves significant improvements in stability and interpretability while maintaining desirable prediction accuracy, and that it has advantages over several state-of-the-art approaches.
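
To make the selection idea concrete, the following is a minimal sketch of a greedy minimal redundancy maximal relevance ranking in which a correlation score replaces mutual information. It is not the SIP-FS implementation: absolute Pearson correlation is used here only as a stand-in for the generalized correlation named in the abstract, the function names `abs_corr`, `mrmr_rank`, and `jaccard_stability` are hypothetical, and the Jaccard score merely measures how consistent several selections are after the fact. It does not reproduce how SIP-FS builds stability into its criterion, which the abstract does not specify.

```python
import numpy as np

def abs_corr(a, b):
    """Absolute Pearson correlation (assumed stand-in for generalized correlation)."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a @ a) * (b @ b))
    return 0.0 if denom == 0 else abs(a @ b) / denom

def mrmr_rank(X, y, k):
    """Greedy ranking: each step picks the feature with the highest
    relevance to y minus mean redundancy with already selected features."""
    n_features = X.shape[1]
    k = min(k, n_features)
    relevance = np.array([abs_corr(X[:, j], y) for j in range(n_features)])
    selected = [int(np.argmax(relevance))]  # start with the most relevant feature
    while len(selected) < k:
        best_j, best_score = -1, -np.inf
        for j in range(n_features):
            if j in selected:
                continue
            redundancy = np.mean([abs_corr(X[:, j], X[:, s]) for s in selected])
            score = relevance[j] - redundancy
            if score > best_score:
                best_j, best_score = j, score
        selected.append(best_j)
    return selected

def jaccard_stability(rankings):
    """Mean pairwise Jaccard similarity between selected feature sets
    (requires at least two selections)."""
    sets = [set(r) for r in rankings]
    pairs = [(s, t) for i, s in enumerate(sets) for t in sets[i + 1:]]
    return float(np.mean([len(s & t) / len(s | t) for s, t in pairs]))

# Toy data: feature 1 duplicates feature 0, feature 2 carries new information.
a = np.array([1, 1, -1, -1, 1, 1, -1, -1], dtype=float)
b = np.array([1, -1, 1, -1, 1, -1, 1, -1], dtype=float)
X = np.column_stack([a, a, b])
y = 2 * a + b
print(mrmr_rank(X, y, k=2))  # [0, 2]
```

In the toy data the duplicated feature is skipped in favor of the less relevant but non-redundant one. For a one-versus-all setting, `y` could be a 0/1 indicator for the class of interest, and `jaccard_stability` could be applied to the selections obtained on different resamples of the training data.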
