Approximation Error Bounds via Rademacher's Complexity

Approximation properties of some connectionistic models, commonly used to construct approximation schemes for optimization problems whose admissible solutions are multivariable functions, are investigated. Such models consist of linear combinations of computational units with adjustable parameters. The relationship between model complexity (the number of computational units) and approximation error is studied using tools from Statistical Learning Theory, such as Talagrand's inequality, the fat-shattering dimension, and Rademacher's complexity. For some families of multivariable functions, estimates of the approximation accuracy achievable by models with certain computational units are derived in terms of the Rademacher complexities of the families. These estimates improve previously available ones, which were expressed in terms of the VC dimension and derived via union-bound techniques. The results are applied to approximation schemes with certain radial basis functions as computational units, for which it is shown that the estimates do not exhibit the curse of dimensionality with respect to the number of variables.
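
As background, here is a sketch of the standard quantities involved, not a statement of the paper's specific bounds. For a family $\mathcal{G}$ of real-valued functions and sample points $x_1,\dots,x_m$, the empirical Rademacher complexity is

\[
\hat{\mathcal{R}}_m(\mathcal{G}) \;=\; \mathbb{E}_{\sigma}\Bigl[\,\sup_{g\in\mathcal{G}}\frac{1}{m}\sum_{i=1}^{m}\sigma_i\,g(x_i)\Bigr],
\qquad \sigma_1,\dots,\sigma_m \ \text{i.i.d.},\ \Pr(\sigma_i=\pm 1)=\tfrac{1}{2}.
\]

A typical estimate of the Maurey-Jones-Barron type, which results of this kind refine, states that for $f$ in the closure of the convex hull of $\mathcal{G}$ in a Hilbert space, with $\sup_{g\in\mathcal{G}}\|g\|\le B$, there exists a convex combination $f_n$ of $n$ computational units from $\mathcal{G}$ such that

\[
\|f - f_n\| \;\le\; \frac{B}{\sqrt{n}},
\]

so the error decays with the number of units at a rate that does not depend explicitly on the number of variables.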
