Heuristics for the selection of weights in sequential feed-forward neural networks: An experimental study

Selecting the weights of each new hidden unit in a sequential feed-forward neural network (FNN) usually involves a non-linear optimization problem that cannot be solved analytically in the general case, so a suboptimal solution is searched for heuristically. Most models found in the literature choose the first-layer weights of each hidden unit so that its output vector matches the previous residue as closely as possible. The second-layer weights may or may not be optimized (in a least-squares sense). Several exceptions to the idea of matching the residue perform an (implicit or explicit) orthogonalization of the output vectors of the hidden units; in this case, the second-layer weights are always optimized. This paper presents an experimental study of these approaches to selecting the weights of sequential FNNs. Our results indicate that orthogonalizing the output vectors of the hidden units outperforms the strategy of matching the residue, both for approximation and for generalization.
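The two families of strategies described above can be contrasted in a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the random candidate search used as the heuristic, the tanh hidden units, the function names `candidate_outputs` and `fit_sequential`, and the strategy labels `match_no_refit`, `match_refit` and `orthogonalize`. Each step draws a pool of candidate hidden units, scores them either against the residue directly or after removing their projection onto the output vectors already in the network, and then sets the second-layer weights with or without a least-squares refit.

```python
import numpy as np


def candidate_outputs(X, rng, n_candidates):
    # Hypothetical heuristic: random first-layer weights and biases, tanh units.
    W = rng.normal(size=(X.shape[1], n_candidates))
    b = rng.normal(size=n_candidates)
    return np.tanh(X @ W + b)  # shape (n_samples, n_candidates)


def fit_sequential(X, y, n_units, strategy, n_candidates=50, seed=0):
    """Add hidden units one at a time; return second-layer weights and
    the sum-of-squares training error after each added unit."""
    rng = np.random.default_rng(seed)
    H = np.empty((X.shape[0], 0))  # output vectors of the units chosen so far
    coef = np.empty(0)             # second-layer weights
    residue = y.copy()
    errors = []
    for _ in range(n_units):
        C = candidate_outputs(X, rng, n_candidates)
        if strategy == "orthogonalize" and H.shape[1] > 0:
            # Explicit orthogonalization: keep only the part of each candidate
            # that is orthogonal to the span of the existing output vectors.
            Q, _ = np.linalg.qr(H)
            D = C - Q @ (Q.T @ C)
        else:
            # Matching the residue: score the raw candidate output vectors.
            D = C
        sq_norms = np.maximum(np.sum(D * D, axis=0), 1e-12)
        scores = (residue @ D) ** 2 / sq_norms
        best = int(np.argmax(scores))
        H = np.column_stack([H, C[:, best]])
        if strategy == "match_no_refit":
            # Only the new unit's second-layer weight is computed.
            h = C[:, best]
            coef = np.append(coef, (residue @ h) / (h @ h))
        else:
            # All second-layer weights are re-optimized by least squares.
            coef, *_ = np.linalg.lstsq(H, y, rcond=None)
        residue = y - H @ coef
        errors.append(float(residue @ residue))
    return coef, errors


if __name__ == "__main__":
    # Toy regression problem, chosen only for illustration.
    rng = np.random.default_rng(1)
    X = rng.uniform(-1, 1, size=(80, 2))
    y = np.sin(3 * X[:, 0]) + X[:, 1] ** 2
    for name in ("match_no_refit", "match_refit", "orthogonalize"):
        _, errs = fit_sequential(X, y, n_units=6, strategy=name)
        print(name, errs[-1])
```

In this sketch, the orthogonalized score measures how much the training error would drop after a full least-squares refit, whereas the plain matching score ignores the overlap between a candidate and the units already present. The sketch says nothing about generalization, which the study evaluates experimentally.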
