The ensemble approach to neural-network learning and generalization

This paper proposes a new method for learning and generalization with a general one-hidden-layer feedforward neural network. The network is a linear combination of heterogeneous nodes whose parameter values are prescribed randomly. The node parameters are learned through adaptive stochastic optimization using a generalization data set, while the linear coefficients of the combination are obtained by linear regression on the training set. Nodes are learned one at a time. The method allows the proper number of network nodes to be chosen and is computationally efficient. It was tested on mathematical examples and on real problems from materials science and technology.
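The node-by-node procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract gives no details, so plain random search stands in for the adaptive stochastic optimization, and the two node types (tanh and Gaussian ridge), the sampling distributions, and all names (`fit_ensemble_net`, `predict`, `trials`, `max_nodes`) are assumptions made for illustration. What it keeps from the text is the division of labor: linear coefficients come from least squares on the training set, node parameters are selected by error on the generalization set, and growth stops when no new node helps.

```python
import numpy as np

def node(X, w, b, kind):
    """Evaluate one hidden node. The two node types are assumed, not from the paper."""
    z = X @ w + b
    return np.tanh(z) if kind == 0 else np.exp(-z ** 2)

def fit_ensemble_net(X_tr, y_tr, X_gen, y_gen, max_nodes=20, trials=50, seed=0):
    """Grow a one-hidden-layer net one node at a time (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    # Start from a bias-only model.
    H_tr = np.ones((len(X_tr), 1))
    H_gen = np.ones((len(X_gen), 1))
    coef, *_ = np.linalg.lstsq(H_tr, y_tr, rcond=None)
    best_err = np.mean((H_gen @ coef - y_gen) ** 2)
    nodes = []
    for _ in range(max_nodes):
        accepted = None
        for _ in range(trials):
            # Randomly prescribed node parameters (plain random search,
            # standing in for the adaptive stochastic optimization).
            w = rng.normal(scale=2.0, size=X_tr.shape[1])
            b = rng.normal(scale=2.0)
            kind = int(rng.integers(2))
            Ht = np.column_stack([H_tr, node(X_tr, w, b, kind)])
            Hg = np.column_stack([H_gen, node(X_gen, w, b, kind)])
            # Linear coefficients: least squares on the training set only.
            c, *_ = np.linalg.lstsq(Ht, y_tr, rcond=None)
            # Node parameters are judged on the generalization set.
            err = np.mean((Hg @ c - y_gen) ** 2)
            if err < best_err:
                best_err = err
                accepted = (w, b, kind, Ht, Hg, c)
        if accepted is None:
            break  # no candidate improves generalization error: stop adding nodes
        w, b, kind, H_tr, H_gen, coef = accepted
        nodes.append((w, b, kind))
    return nodes, coef

def predict(model, X):
    """Evaluate the fitted net on new inputs."""
    nodes, coef = model
    cols = [np.ones(len(X))] + [node(X, w, b, k) for w, b, k in nodes]
    return np.column_stack(cols) @ coef
```

In this sketch the number of nodes is not fixed in advance; it is whatever count the loop reaches before candidates stop lowering the generalization-set error, which is one simple way to realize the "proper number of nodes" idea mentioned above.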
