The well-posedness analysis of the kernel adaline

In this paper, we investigate the well-posedness of the kernel adaline, which finds the linear coefficients of a radial basis function network by deterministic gradient descent. We show that gradient descent provides an inherent regularization, provided the training is properly early-stopped. Together with other popular regularization techniques, this result is examined within a unifying regularization-function framework. This understanding offers an alternative, and possibly simpler, way to obtain regularized solutions compared with the cross-validation approach used in regularization networks.
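The scheme described above can be sketched in a few lines: fit the kernel expansion coefficients by full-batch (deterministic) gradient descent on the squared error, and let a held-out validation set decide when to stop. This is a minimal illustrative sketch, not the paper's exact algorithm; the Gaussian kernel width, learning rate, and patience rule are assumptions chosen for the example.

```python
import numpy as np

def gaussian_kernel(X, Z, sigma=1.0):
    # Pairwise Gaussian (RBF) kernel matrix between rows of X and Z.
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def kernel_adaline(X, y, X_val, y_val, sigma=1.0, eta=0.05,
                   max_epochs=1000, patience=25):
    """Deterministic gradient descent on the expansion coefficients alpha
    (f(x) = sum_j alpha_j k(x, x_j)), early-stopped on a validation set."""
    K = gaussian_kernel(X, X, sigma)        # training Gram matrix
    K_val = gaussian_kernel(X_val, X, sigma)
    alpha = np.zeros(len(X))
    best_alpha, best_err, wait = alpha.copy(), np.inf, 0
    for _ in range(max_epochs):
        # Feature-space gradient step: alpha <- alpha + eta * (y - K alpha).
        # Stable whenever eta < 2 / lambda_max(K); since the diagonal of K
        # is 1, lambda_max <= n, so a small eta suffices.
        alpha += eta * (y - K @ alpha)
        err = np.mean((y_val - K_val @ alpha) ** 2)
        if err < best_err:
            best_alpha, best_err, wait = alpha.copy(), err, 0
        else:
            wait += 1
            if wait > patience:
                break  # validation error no longer improving: stop early
    return best_alpha, best_err
```

Stopping at the validation minimum plays the role of the regularization parameter: fewer iterations correspond to stronger smoothing, so no explicit penalty term (and no cross-validated ridge parameter) is needed.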
