Feature weight estimation based on dynamic representation and neighbor sparse reconstruction

Abstract: Relief-like algorithms are widely used for feature selection to reduce the dimensionality of high-dimensional data containing thousands of irrelevant variables, owing to their low computational cost and high accuracy. However, classical Relief algorithms do not make explicit the dynamic procedure by which weights are updated iteratively. This paper proposes an innovative feature weight estimation method, called dynamic representation and neighbor sparse reconstruction-based Relief (DRNSR-Relief). Like the classical Relief algorithms, DRNSR-Relief aims to maximize the expected margin in the weighted feature space. A dynamic representation framework is introduced to capture the dynamic relationship between the expected margin vector and the weight vector. To achieve better neighbor reconstruction, DRNSR-Relief decomposes the nonlinear problem into a set of locally linear ones through local hyperplanes with l1 regularization, and then estimates feature weights in a large-margin framework. Using the gradient ascent method, the convergence of DRNSR-Relief is guaranteed. To demonstrate the validity and effectiveness of our formulation for feature selection in supervised learning, we perform extensive experiments on synthetic and real-world datasets. Experimental results indicate that DRNSR-Relief is very promising.
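
To make the abstract's idea concrete, the following is a minimal sketch of a Relief-style weight update in which the nearest-hit and nearest-miss points are replaced by l1-sparse reconstructions from same-class and other-class neighbors (local hyperplanes), followed by a margin-gradient ascent step on the weights. This is an illustrative simplification under stated assumptions, not the paper's exact DRNSR-Relief formulation: the function names (`sparse_reconstruct`, `drnsr_relief_sketch`), the use of scikit-learn's Lasso for the l1-regularized reconstruction, the learning rate, and the nonnegativity-plus-unit-norm projection are all assumptions made here for illustration.

```python
import numpy as np
from sklearn.linear_model import Lasso  # coordinate-descent l1 solver


def sparse_reconstruct(x, N, alpha=1e-2):
    """l1-regularized reconstruction of sample x from neighbor rows N.

    Solves min_c ||x - N^T c||^2 + alpha * ||c||_1 and returns the
    reconstructed point c @ N on the local (hit or miss) hyperplane.
    """
    lasso = Lasso(alpha=alpha, fit_intercept=False, max_iter=10_000)
    lasso.fit(N.T, x)          # columns of N.T are the neighbor vectors
    return lasso.coef_ @ N


def drnsr_relief_sketch(X, y, k=5, alpha=1e-2, lr=0.1, n_iter=30):
    """Illustrative Relief-style feature weighting with sparse neighbor
    reconstruction (hypothetical simplification of DRNSR-Relief).

    Assumes each class has at least k+1 samples. As in iterative Relief
    variants, the hit/miss points are treated as fixed within each
    iteration, so the margin is linear in w and gradient ascent applies.
    """
    n, d = X.shape
    w = np.ones(d) / d                       # uniform initial weights
    for _ in range(n_iter):
        margin_grad = np.zeros(d)
        Xw = X * w                           # distances in weighted space
        for i in range(n):
            same = np.flatnonzero((y == y[i]) & (np.arange(n) != i))
            diff = np.flatnonzero(y != y[i])
            # k nearest same-class / other-class neighbors under current w
            hits = same[np.argsort(np.linalg.norm(Xw[same] - Xw[i], axis=1))[:k]]
            miss = diff[np.argsort(np.linalg.norm(Xw[diff] - Xw[i], axis=1))[:k]]
            h = sparse_reconstruct(X[i], X[hits], alpha)  # hit hyperplane point
            m = sparse_reconstruct(X[i], X[miss], alpha)  # miss hyperplane point
            # margin(w) = sum_j w_j (|x_ij - m_j| - |x_ij - h_j|); its gradient:
            margin_grad += np.abs(X[i] - m) - np.abs(X[i] - h)
        w = np.maximum(w + lr * margin_grad / n, 0.0)     # ascent + w >= 0
        w /= np.linalg.norm(w) + 1e-12                    # project onto unit sphere
    return w
```

A quick usage example: `w = drnsr_relief_sketch(X, y, k=5)` returns a nonnegative weight vector whose largest entries flag the most discriminative features, which can then be thresholded or ranked for selection. The projection onto the nonnegative unit sphere is one common choice for keeping Relief weights interpretable and bounded; other constraint sets would serve equally well in this sketch.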
