Comparison of worst case errors in linear and neural network approximation

Sets of multivariable functions are described for which worst-case errors in linear approximation are larger than those in approximation by neural networks. A theoretical framework for such a description is developed in the context of nonlinear approximation by fixed versus variable basis functions. Comparisons of approximation rates are formulated in terms of certain norms tailored to sets of basis functions. The results are applied to perceptron networks.
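
The comparison described above can be sketched with the standard worst-case quantities of approximation theory. The notation below is an assumption made for illustration (a normed space $(X, \|\cdot\|)$, a set $K \subset X$ of functions to be approximated, and a set $G \subset X$ of basis functions such as perceptron units); the paper's own symbols and definitions may differ.

```latex
% Worst-case error (deviation) of K from an approximating set A:
\delta(K, A) \;=\; \sup_{f \in K} \, \inf_{g \in A} \, \| f - g \|

% Fixed basis (linear approximation): the best achievable worst-case error
% over all n-dimensional linear subspaces X_n is the Kolmogorov n-width:
d_n(K) \;=\; \inf_{X_n \subset X,\ \dim X_n = n} \, \delta(K, X_n)

% Variable basis (e.g. a network with n hidden units): all n-term
% combinations of elements of G, with the elements chosen per target f:
\operatorname{span}_n G \;=\; \Bigl\{ \sum_{i=1}^{n} w_i \, g_i \;:\; w_i \in \mathbb{R},\ g_i \in G \Bigr\}

% The sets of interest are those K for which the variable basis wins:
\delta(K, \operatorname{span}_n G) \;<\; d_n(K)
```

In this reading, $\operatorname{span}_n G$ is not a linear subspace, since the $n$ basis functions may change with the function being approximated, which is what makes the approximation nonlinear.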
