Learning and Convergence of the Normalized Radial Basis Functions Networks

In this paper we analyze the convergence and rates of convergence of normalized radial basis function networks by relating their \(L_2\) error to the \(L_2\) error of the Wolverton-Wagner regression estimate. The network parameters are learned by minimizing the empirical risk, and the resulting networks are applied to function learning and classification.
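As a rough illustration of the objects discussed above, the sketch below builds a normalized radial basis function network and fits it by minimizing the empirical \(L_2\) risk. It rests on several assumptions that are not taken from the paper: a Gaussian kernel, fixed equally spaced centers, a fixed common width, and minimization over the output weights only (the paper minimizes the empirical risk over the network parameters more generally). The function names `normalized_rbf_features`, `fit_output_weights`, and `predict`, as well as the synthetic data, are hypothetical.

```python
import numpy as np


def normalized_rbf_features(x, centers, width):
    """Return the normalized Gaussian basis matrix, one row per sample.

    Entry (j, i) is K(|x_j - c_i| / width) divided by the sum of the
    kernel values over all centers, so every row sums to one.
    """
    # Squared Euclidean distance between every sample and every center.
    d2 = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    # Shift by the row-wise minimum before exponentiating; the common
    # factor cancels in the normalization and avoids division by zero.
    k = np.exp(-(d2 - d2.min(axis=1, keepdims=True)) / (2.0 * width ** 2))
    return k / k.sum(axis=1, keepdims=True)


def fit_output_weights(x, y, centers, width):
    """Minimize the empirical L2 risk over the output weights (assumption:
    centers and width are held fixed, so this is a least-squares problem)."""
    phi = normalized_rbf_features(x, centers, width)
    w, *_ = np.linalg.lstsq(phi, y, rcond=None)
    return w


def predict(x, centers, width, w):
    """Evaluate the network: a weighted average of the output weights."""
    return normalized_rbf_features(x, centers, width) @ w


# Synthetic regression data (hypothetical): noisy samples of sin(pi * x).
rng = np.random.default_rng(0)
x_train = rng.uniform(-1.0, 1.0, size=(200, 1))
y_train = np.sin(np.pi * x_train[:, 0]) + 0.1 * rng.standard_normal(200)

centers = np.linspace(-1.0, 1.0, 10)[:, None]  # assumed fixed centers
width = 0.2                                    # assumed fixed kernel width

w = fit_output_weights(x_train, y_train, centers, width)
train_risk = np.mean((predict(x_train, centers, width, w) - y_train) ** 2)
print("empirical L2 risk:", train_risk)
```

Because the normalized basis functions are nonnegative and sum to one, the network output at any point is a convex combination of the output weights. For a two-class problem with labels in {0, 1}, one possible plug-in classification rule, again only as an assumption for illustration, is to threshold `predict(...)` at 1/2.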
