Integrated feature and architecture selection

In this paper, we present an integrated approach to feature and architecture selection for single-hidden-layer feedforward neural networks trained via backpropagation. We adopt a statistical model-building perspective and analyze neural networks within a nonlinear regression framework. The algorithm employs a likelihood-ratio test statistic as its model selection criterion, applied in a sequential procedure that selects the best neural network starting from an initial architecture determined by heuristic rules. Application results for an object recognition problem demonstrate the selection algorithm's effectiveness in identifying reduced neural networks with equivalent prediction accuracy.
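The abstract does not give the exact form of the test statistic, so the following is only a minimal sketch under stated assumptions. It assumes Gaussian errors, for which a likelihood-ratio statistic comparing a full and a reduced network can be written as n ln(SSE_reduced / SSE_full), and it assumes a greedy backward elimination as the sequential procedure. The names `lr_statistic`, `backward_select`, and `sse_of` are hypothetical, and the critical value (for example 3.841 for a chi-square test with one degree of freedom at the 0.05 level) is supplied by the caller.

```python
import math


def lr_statistic(sse_full, sse_reduced, n):
    """Gaussian-error likelihood-ratio statistic: n * ln(SSE_reduced / SSE_full).

    Assumed form, not taken from the paper. Both sums of squared errors
    must be positive; n is the number of training exemplars.
    """
    if sse_full <= 0 or sse_reduced <= 0 or n <= 0:
        raise ValueError("sse_full, sse_reduced, and n must be positive")
    return n * math.log(sse_reduced / sse_full)


def backward_select(components, sse_of, n, critical_value):
    """Greedy backward elimination driven by the statistic above.

    components     -- candidate features or hidden nodes (hypothetical labels)
    sse_of         -- caller-supplied function: list of kept components -> SSE
                      of a network trained with only those components
    critical_value -- threshold above which a removal is judged significant
    """
    kept = list(components)
    sse_full = sse_of(kept)
    while len(kept) > 1:
        # Evaluate every single-component removal and pick the least harmful.
        trials = [(sse_of([c for c in kept if c != d]), d) for d in kept]
        sse_reduced, drop = min(trials, key=lambda t: t[0])
        # Stop when even the least harmful removal degrades the fit significantly.
        if lr_statistic(sse_full, sse_reduced, n) >= critical_value:
            break
        kept.remove(drop)
        sse_full = sse_reduced
    return kept
```

In practice `sse_of` would retrain the reduced network, so each pass costs one training run per remaining component. The sketch treats features and hidden nodes identically, which may not match how the paper orders its tests.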
