Model Selection for Regularized Least-Squares Algorithm in Learning Theory

Abstract. We investigate the problem of model selection for learning algorithms that depend on a continuous parameter. We propose a model selection procedure based on a worst-case analysis and on a data-independent choice of the parameter. For the regularized least-squares algorithm we bound the generalization error of the solution by a quantity depending on a few known constants, and we show that the corresponding model selection procedure reduces to solving a bias-variance problem. Under suitable smoothness conditions on the regression function, we estimate the optimal parameter as a function of the number of data points and we prove that this choice ensures consistency of the algorithm.
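As a concrete illustration (not taken from the paper itself), the sketch below implements kernel regularized least-squares with a data-independent regularization parameter chosen as a function of the sample size n. The Gaussian kernel, its width sigma, the synthetic data, and the n^(-1/2) decay of the parameter are assumptions made only for this example; the paper derives the optimal decay rate under smoothness conditions on the regression function.

```python
import numpy as np

def gaussian_kernel(X1, X2, sigma=1.0):
    # Gaussian (RBF) kernel matrix between the rows of X1 and X2.
    sq_dists = (np.sum(X1**2, axis=1)[:, None]
                + np.sum(X2**2, axis=1)[None, :]
                - 2.0 * X1 @ X2.T)
    return np.exp(-sq_dists / (2.0 * sigma**2))

def rls_fit(X, y, lam, sigma=1.0):
    # By the representer theorem, the minimizer of
    #   (1/n) sum_i (f(x_i) - y_i)^2 + lam * ||f||_H^2
    # is f(x) = sum_i c_i K(x, x_i) with (K + n*lam*I) c = y.
    n = X.shape[0]
    K = gaussian_kernel(X, X, sigma)
    return np.linalg.solve(K + n * lam * np.eye(n), y)

def rls_predict(X_train, c, X_test, sigma=1.0):
    # Evaluate f at the test points from the fitted coefficients.
    return gaussian_kernel(X_test, X_train, sigma) @ c

# Data-independent parameter choice: lam_n depends only on n, never on
# the observed data. The exponent 1/2 here is illustrative only.
rng = np.random.default_rng(0)
n = 200
X = rng.uniform(-1.0, 1.0, size=(n, 1))
y = np.sin(3.0 * X[:, 0]) + 0.1 * rng.standard_normal(n)
lam_n = n ** (-0.5)
c = rls_fit(X, y, lam_n)
y_hat = rls_predict(X, c, X)
print("training MSE:", np.mean((y_hat - y) ** 2))
```

The point of the data-independent choice is that lam_n is fixed before seeing the sample, so the bias-variance trade-off is resolved analytically rather than by data-driven tuning such as cross-validation.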
