A new wrapper feature selection approach using neural network

This paper presents a new feature selection (FS) algorithm based on the wrapper approach using neural networks (NNs). The key aspect of this algorithm is that it determines NN architectures automatically during the FS process. The algorithm uses a constructive approach that draws on correlation information both to select features and to determine NN architectures. We call this algorithm the constructive approach for FS (CAFS). Correlation information is used in CAFS to steer the search strategy toward less correlated (distinct) features, provided they enhance the accuracy of the NNs. This reduces redundancy of information and results in compact NN architectures. We evaluate the performance of CAFS on eight benchmark classification problems. The experimental results show that CAFS selects features while producing compact NN architectures.
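The idea of preferring less correlated features only when they improve accuracy can be illustrated with a minimal sketch. This is not the CAFS algorithm itself: the abstract does not give its details, so the ordering rule (mean absolute Pearson correlation with the other features), the strict-improvement acceptance test, the `baseline` parameter, and all function names below are assumptions made for illustration. The constructively grown NN is replaced by a hypothetical nearest-centroid scorer so that the example stays short and runs with the standard library only.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)

def order_by_distinctness(columns):
    """Feature indices, least correlated with the other features (on average) first."""
    k = len(columns)
    mean_abs = []
    for i in range(k):
        others = [abs(pearson(columns[i], columns[j])) for j in range(k) if j != i]
        mean_abs.append(sum(others) / len(others) if others else 0.0)
    return sorted(range(k), key=lambda i: mean_abs[i])

def forward_select(columns, evaluate, baseline=0.0):
    """Greedy wrapper search: visit features in distinctness order and keep a
    feature only if it strictly improves the score returned by `evaluate`."""
    selected, best = [], baseline
    for i in order_by_distinctness(columns):
        score = evaluate(selected + [i])
        if score > best:
            selected, best = selected + [i], score
    return selected, best

def make_centroid_evaluator(columns, labels):
    """Hypothetical stand-in for the NN: training accuracy of a nearest-centroid rule."""
    classes = sorted(set(labels))

    def evaluate(subset):
        rows = [[columns[j][r] for j in subset] for r in range(len(labels))]
        centroids = {}
        for c in classes:
            members = [row for row, y in zip(rows, labels) if y == c]
            centroids[c] = [sum(v) / len(members) for v in zip(*members)]

        def dist(row, c):
            return sum((a - b) ** 2 for a, b in zip(row, centroids[c]))

        hits = sum(min(classes, key=lambda c: dist(row, c)) == y
                   for row, y in zip(rows, labels))
        return hits / len(labels)

    return evaluate

# Toy data (made up for this sketch): feature 1 duplicates feature 0 up to
# scale, and feature 2 carries no class information.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
columns = [
    [1, 2, 3, 2, 9, 8, 7, 9],
    [2, 4, 6, 4, 18, 16, 14, 18],
    [2, 8, 4, 6, 6, 4, 8, 2],
]
evaluate = make_centroid_evaluator(columns, labels)
# baseline=0.5 is the chance accuracy for two balanced classes.
selected, best = forward_select(columns, evaluate, baseline=0.5)
print(selected, best)
```

On this toy data the uninformative feature is tried first because it is the most distinct, but it is rejected because it does not beat the baseline. One of the two redundant features is then accepted, and the other is rejected because it adds no accuracy. In a wrapper closer to the one described above, `evaluate` would instead train an NN whose hidden layer is grown constructively and report its accuracy on held-out data.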
