On the optimality of neural-network approximation using incremental algorithms

The problem of approximating functions by neural networks using incremental algorithms is studied. For functions belonging to a rather general class, characterized by certain smoothness properties with respect to the L2 norm, we derive upper bounds on the approximation error, where the error is measured in the Lq norm for 1 ≤ q ≤ ∞. These results extend previous work, applicable for q = 2, and provide an explicit algorithm that achieves the derived approximation error rate. For q ≤ 2, near-optimal rates of convergence are demonstrated. A gap remains, however, with respect to a recently established lower bound for q > 2, although the rates achieved are provably better than those obtained by optimal linear approximation. Extensions of the results from the L2 norm to Lp are also discussed. A further conclusion from our results is that no generality is lost by restricting to networks with positive hidden-to-output weights. Moreover, explicit bounds on the size of the hidden-to-output weights are given which suffice to guarantee the stated convergence rates.
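To make the incremental idea concrete, the following is a minimal sketch of a relaxed greedy scheme in the spirit of Jones (1992) and Barron (1993): at step n the current approximant is mixed with one new sigmoidal unit chosen from a finite candidate dictionary, using the convex-combination update f_n = (1 - a_n) f_{n-1} + a_n s g_n with a_n = 2/(n+1). The toy target, the grid of candidate weights and biases, the scale parameter, and the function name incremental_fit are illustrative assumptions for this sketch, not the exact construction analysed in the paper.

```python
import numpy as np

# Discretise [0, 1]; the L2 error is estimated on a uniform grid.
x = np.linspace(0.0, 1.0, 400)
target = np.sin(2 * np.pi * x) + 0.5 * np.cos(6 * np.pi * x)  # toy target

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Finite dictionary of shifted/scaled sigmoidal units (an illustrative choice).
weights = np.linspace(-20.0, 20.0, 41)
biases = np.linspace(-20.0, 20.0, 41)
dictionary = np.array([sigmoid(w * x + b) for w in weights for b in biases])

def incremental_fit(target, dictionary, n_units, scale=3.0):
    """Relaxed greedy update: f_n = (1 - a_n) f_{n-1} + a_n * s * g_n."""
    f = np.zeros_like(target)
    for n in range(1, n_units + 1):
        a = 2.0 / (n + 1)  # standard relaxation schedule
        best_err, best_f = np.inf, f
        # Scan the dictionary; both output-weight signs are tried here for
        # simplicity, even though positive weights can be shown to suffice.
        for g in dictionary:
            for s in (scale, -scale):
                cand = (1.0 - a) * f + a * s * g
                err = np.sqrt(np.mean((target - cand) ** 2))
                if err < best_err:
                    best_err, best_f = err, cand
        f = best_f
        print(f"n = {n:3d}   L2 error ~ {best_err:.4f}")
    return f

approx = incremental_fit(target, dictionary, n_units=30)
```

In this setting the classical greedy analysis gives an L2 error of order n^(-1/2) for targets in (a scaled) convex hull of the dictionary; the sketch only illustrates the incremental structure of such algorithms, not the refined Lq rates derived in the paper.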
