Complexity Theory of Neural Networks and Classification Problems

Multilayer feedforward neural networks have been proposed for a variety of classification and recognition tasks, ranging from speech recognition to sonar signal processing. It is generally assumed that the underlying application need not be modeled in detail, and that an artificial neural network solution can instead be obtained by training on empirical data with little or no a priori information about the application. We argue that the right network architecture is fundamental for a good solution to exist, and that the class of network architectures forms the basis of a complexity theory for classification problems. An abstraction of this notion of complexity leads to ideas similar to Kolmogorov's minimum description length criterion, entropy, and k-widths. We present some basic results on this measure of complexity. From this point of view, artificial neural network solutions to real engineering problems may not ameliorate the difficulties of classification problems, but rather obscure and postpone them. In particular, we doubt that designing neural networks to solve interesting, nontrivial engineering problems will be any easier than other large-scale engineering design problems, such as those in aerodynamics and semiconductor device modeling.
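
For orientation, the complexity notions named above have standard definitions in approximation theory; the following is a sketch of those textbook definitions (in the spirit of Lorentz's metric entropy and widths), not a formula taken from this paper. For a function class F in a normed space X, the metric entropy at scale \varepsilon is

    H(\varepsilon, F) = \log_2 N(\varepsilon, F),

where N(\varepsilon, F) is the smallest number of balls of radius \varepsilon needed to cover F, and the Kolmogorov k-width is

    d_k(F; X) = \inf_{\dim X_k = k} \; \sup_{f \in F} \; \inf_{g \in X_k} \| f - g \|_X,

the worst-case error of the best approximation of F by a k-dimensional linear subspace X_k of X. Roughly, classes whose entropy grows quickly or whose widths decay slowly demand correspondingly large approximating families, which is the sense in which a class of network architectures induces a complexity measure on classification problems.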
