Upper bounds on the number of hidden neurons in feedforward networks with arbitrary bounded nonlinear activation functions

It is well known that standard single-hidden-layer feedforward networks (SLFNs) with at most N hidden neurons (including biases) can learn N distinct samples (x_i, t_i) with zero error, and that the weights connecting the input neurons to the hidden neurons can be chosen "almost" arbitrarily. However, these results were obtained for the case in which the activation function of the hidden neurons is the signum function. This paper rigorously proves that SLFNs with at most N hidden neurons and any bounded nonlinear activation function that has a limit at one infinity can learn N distinct samples (x_i, t_i) with zero error. The earlier technique of choosing the hidden-layer weights arbitrarily does not carry over to such general activation functions. The proof of our result is constructive and thus yields a method to directly compute the weights of a standard SLFN with any such bounded nonlinear activation function, in contrast to the iterative training algorithms in the literature.
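The flavor of the constructive result can be illustrated numerically. The sketch below is a simplified illustration in the spirit of the theorem, not the paper's exact construction: with one hidden neuron per sample and randomly chosen input weights and biases, the N x N hidden-layer output matrix H, with entries H_ij = g(w_j . x_i + b_j), is generically invertible for a bounded nonlinear activation such as tanh, so the output weights solving H beta = T fit all N samples exactly. The toy data, the choice of tanh, and the random-weight scheme are all assumptions of this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# N distinct training samples (x_i, t_i); toy data for illustration only.
N, d = 8, 3
X = rng.standard_normal((N, d))
T = rng.standard_normal((N, 1))

# One hidden neuron per sample. Randomly drawn input weights and biases
# make the N x N hidden output matrix H invertible with probability 1
# for a bounded nonlinear activation (tanh here) -- this stands in for
# the paper's explicit construction of the hidden-layer parameters.
W = rng.standard_normal((d, N))
b = rng.standard_normal((1, N))
H = np.tanh(X @ W + b)  # H[i, j] = g(w_j . x_i + b_j)

# Output weights solve H beta = T exactly whenever H is invertible,
# giving zero error on the N samples.
beta = np.linalg.solve(H, T)

assert np.allclose(H @ beta, T)
print("max training error:", np.abs(H @ beta - T).max())
```

Note that no iterative training occurs: once the hidden-layer parameters are fixed, the output weights come from a single linear solve, which is the sense in which the proof "directly finds" the network weights.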
