Feature selection based on extreme learning machine

Feature selection (FS) is a crucial pre-processing step in pattern classification. FS addresses the problem of finding the most compact and informative subset of the initial feature set, in order to improve the performance of a pattern classification system or to reduce storage requirements. Recently, Yang et al. proposed a wrapper-based feature selection method for multilayer perceptron (MLP) neural networks. The learning speed of that algorithm is very slow, especially for large databases, because the network weights are tuned iteratively with the back-propagation algorithm. To address this problem, we propose a feature selection algorithm based on the extreme learning machine (ELM). It uses a feature ranking criterion that measures the significance of a feature by computing the aggregate difference between the outputs of a probabilistic single-hidden-layer feedforward network (SLFN) with and without that feature. The SLFN is trained with ELM, which randomly assigns the hidden-layer weights and analytically determines the output-layer weights. We compared the proposed algorithm with Yang's method and three other feature selection algorithms. The experimental results show that the proposed method is both effective and efficient.
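
To make the two core ideas concrete, here is a minimal NumPy sketch: `train_elm` trains an SLFN the ELM way (random hidden-layer weights and biases, output weights solved analytically via the Moore-Penrose pseudoinverse), and `rank_features` scores each feature by the aggregate change in the trained network's outputs when that feature is masked. The function names are ours, not the paper's, and zeroing a feature's column is only one simple stand-in for evaluating the network "without the feature"; the paper's actual criterion may differ.

```python
import numpy as np

def train_elm(X, Y, n_hidden, rng):
    # ELM training: hidden-layer parameters are drawn at random,
    # output-layer weights are solved analytically (no iterative tuning).
    n_features = X.shape[1]
    W = rng.standard_normal((n_features, n_hidden))   # random input-to-hidden weights
    b = rng.standard_normal(n_hidden)                 # random hidden biases
    H = np.tanh(X @ W + b)                            # hidden-layer activation matrix
    beta = np.linalg.pinv(H) @ Y                      # Moore-Penrose least-squares solution
    return W, b, beta

def elm_predict(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta

def rank_features(X, Y, n_hidden=50, seed=0):
    # Score each feature by the aggregate difference between the network's
    # outputs with and without (here: zeroed out) that feature.
    rng = np.random.default_rng(seed)
    W, b, beta = train_elm(X, Y, n_hidden, rng)
    base = elm_predict(X, W, b, beta)
    scores = np.empty(X.shape[1])
    for j in range(X.shape[1]):
        X_masked = X.copy()
        X_masked[:, j] = 0.0                          # hypothetical "remove feature j" step
        scores[j] = np.abs(base - elm_predict(X_masked, W, b, beta)).sum()
    return np.argsort(scores)[::-1]                   # most significant features first
```

For a classification task, `Y` would typically be a one-hot label matrix, and the top-ranked indices returned by `rank_features` would seed the candidate subsets of a wrapper-style search. Because the output weights come from a single linear solve rather than back-propagation, retraining inside such a search stays cheap.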

[1] Dianhui Wang et al., "Extreme learning machines: a survey," Int. J. Mach. Learn. Cybern., 2011.

[2] Chong Jin Ong et al., "Feature selection via sensitivity analysis of SVM probabilistic outputs," 2008 IEEE International Conference on Systems, Man and Cybernetics, 2008.

[3] Antanas Verikas et al., "Feature selection with neural networks," Pattern Recognit. Lett., 2002.

[4] Huan Liu et al., "Neural-network feature selector," IEEE Trans. Neural Networks, 1997.

[5] Deniz Erdogmus et al., "Feature selection in MLPs and SVMs based on maximum output information," IEEE Transactions on Neural Networks, 2004.

[6] Fuhui Long et al., "Feature selection based on mutual information criteria of max-dependency, max-relevance, and min-redundancy," IEEE Transactions on Pattern Analysis and Machine Intelligence, 2003.

[7] Isabelle Guyon et al., "An Introduction to Variable and Feature Selection," J. Mach. Learn. Res., 2003.

[8] Jason Weston et al., "Embedded Methods," in Feature Extraction, 2006.

[9] Huan Liu et al., "Feature Selection for Classification," Intell. Data Anal., 1997.

[10] Huan Liu et al., "Consistency-based search in feature selection," Artif. Intell., 2003.

[11] James Theiler et al., "Grafting: Fast, Incremental Feature Selection by Gradient Descent in Function Space," J. Mach. Learn. Res., 2003.

[12] Catherine Blake et al., "UCI Repository of machine learning databases," 1998.

[13] Tommy W. S. Chow et al., "Effective feature selection scheme using mutual information," Neurocomputing, 2005.

[14] Chong-Ho Choi et al., "Input Feature Selection by Mutual Information Based on Parzen Window," IEEE Trans. Pattern Anal. Mach. Intell., 2002.

[15] Masoud Nikravesh et al., "Feature Extraction - Foundations and Applications," 2006.

[16] Roberto Battiti et al., "Using mutual information for selecting features in supervised neural net learning," IEEE Trans. Neural Networks, 1994.

[17] Pat Langley et al., "Selection of Relevant Features and Examples in Machine Learning," Artif. Intell., 1997.

[18] Heekuck Oh et al., "Neural Networks for Pattern Recognition," Adv. Comput., 1993.

[19] Jian-Bo Yang et al., "Feature Selection for MLP Neural Network: The Use of Random Permutation of Probabilistic Outputs," IEEE Transactions on Neural Networks, 2009.

[20] Jacek M. Zurada et al., "Normalized Mutual Information Feature Selection," IEEE Transactions on Neural Networks, 2009.

[21] D. Serre, "Matrices: Theory and Applications," 2002.

[22] Ron Kohavi et al., "Wrappers for Feature Subset Selection," Artif. Intell., 1997.

[23] Chee Kheong Siew et al., "Extreme learning machine: Theory and applications," Neurocomputing, 2006.

[24] Huan Liu et al., "Efficient Feature Selection via Analysis of Relevance and Redundancy," J. Mach. Learn. Res., 2004.