Variable selection using neural-network models

Abstract: In this paper we propose an approach to variable selection that uses a neural-network model as the tool for deciding which variables are to be discarded. The method performs a backward selection by successively removing input nodes from a network trained with the complete set of variables as inputs. Each input node is removed together with its connections, and the remaining weights are adjusted so that the overall input–output behavior learned by the network is kept approximately unchanged. A simple criterion for selecting the input nodes to be removed is also developed. The proposed method is tested on a well-known system-identification benchmark. Experimental results show that removing input nodes from the neural-network model improves its generalization ability, and that the method compares favorably with other feature-reduction methods.
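The weight-adjustment step can be illustrated with a small sketch. Because a hidden unit's net input is linear in the first-layer weights, keeping the net inputs approximately unchanged after deleting an input reduces to an ordinary linear least-squares problem. The Python/NumPy fragment below is a minimal sketch under that assumption for a one-hidden-layer network; the function names (`remove_input`, `select_input_to_remove`) and the residual-based removal criterion are illustrative assumptions, not necessarily the paper's exact formulation.

```python
import numpy as np

def remove_input(W, b, X, k):
    """Remove input k and adjust the remaining first-layer weights.

    The adjustment solves a linear least-squares problem so that each
    hidden unit's net input over the data X stays approximately the
    same after the input node (and its connections) is deleted.

    W : (n_hidden, n_inputs)   first-layer weight matrix
    b : (n_hidden,)            hidden-unit biases
    X : (n_samples, n_inputs)  training inputs
    k : index of the input node to remove
    """
    # Net inputs of the hidden units before removal: the target to preserve.
    target = X @ W.T + b                       # (n_samples, n_hidden)

    # Design matrix: the remaining inputs plus a column of ones for the bias.
    X_rem = np.delete(X, k, axis=1)
    A = np.hstack([X_rem, np.ones((X.shape[0], 1))])

    # Least-squares fit: A @ [W_new.T; b_new] ~= target.
    sol, _, _, _ = np.linalg.lstsq(A, target, rcond=None)
    W_new, b_new = sol[:-1].T, sol[-1]
    residual = np.linalg.norm(A @ sol - target)
    return W_new, b_new, residual

def select_input_to_remove(W, b, X):
    """Illustrative criterion (an assumption): remove the input whose
    deletion can be compensated with the smallest least-squares residual."""
    residuals = [remove_input(W, b, X, k)[2] for k in range(X.shape[1])]
    return int(np.argmin(residuals))
```

In a backward-selection loop, one would repeatedly call `select_input_to_remove`, apply `remove_input`, and optionally retrain briefly, stopping when validation performance begins to degrade.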
