Structural Risk Minimization and Rademacher Complexity for Regression

The Structural Risk Minimization principle allows estimating the generalization ability of a learned hypothesis by measuring the complexity of the entire hypothesis class. Two of the most recent and effective complexity measures are the Rademacher Complexity and the Maximal Discrepancy, which have been applied to the derivation of generalization bounds for kernel classifiers. In this work, we extend their application to the regression framework.
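The Rademacher Complexity mentioned above is a data-dependent quantity: it measures how well functions from the hypothesis class can correlate with random sign patterns on the observed sample. As an illustration only (not the authors' method), the following sketch estimates the empirical Rademacher complexity of a bounded-norm linear regressor class by Monte Carlo; the class `{x -> <w, x> : ||w|| <= B}`, the bound `B`, and the function name are assumptions for the example, for which the supremum over `w` has the closed form `(B/n) * ||sum_i sigma_i x_i||`.

```python
import numpy as np

def empirical_rademacher_linear(X, B=1.0, n_trials=1000, seed=0):
    """Monte Carlo estimate of the empirical Rademacher complexity
    R_hat = E_sigma[ sup_f (1/n) sum_i sigma_i f(x_i) ]
    for the hypothesis class {x -> <w, x> : ||w|| <= B}.
    By Cauchy-Schwarz the supremum equals (B/n) * ||sum_i sigma_i x_i||.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    vals = np.empty(n_trials)
    for t in range(n_trials):
        sigma = rng.choice([-1.0, 1.0], size=n)   # Rademacher signs
        vals[t] = B * np.linalg.norm(sigma @ X) / n
    return float(vals.mean())
```

In structural risk minimization, such an estimate would enter a generalization bound additively, so that richer classes (larger `B`) pay a larger complexity penalty on top of their empirical risk; the estimate typically shrinks at the familiar `O(1/sqrt(n))` rate as the sample grows.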
