Empirical minimization

We investigate the behavior of the empirical minimization algorithm using various methods. We first analyze it by comparing the empirical (random) structure with the original structure on the class, either in an additive sense, via the uniform law of large numbers, or in a multiplicative sense, using isomorphic coordinate projections. We then show that a direct analysis of the empirical minimization algorithm yields a significantly better bound, and that the estimates we obtain are essentially sharp. The method of proof we use is based on Talagrand's concentration inequality for empirical processes.
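For readers unfamiliar with the algorithm under study, the following is a minimal illustrative sketch of empirical minimization: given a sample, the learner selects the function in the class with the smallest empirical (sample-average) loss. The finite threshold class, the squared loss, and all variable names are illustrative choices, not taken from the paper.

```python
import numpy as np

def empirical_minimizer(function_class, loss, X, y):
    """Return the function in `function_class` with the smallest empirical loss.

    Empirical minimization picks argmin over f of (1/n) * sum_i loss(f(x_i), y_i).
    """
    risks = [np.mean([loss(f(x), t) for x, t in zip(X, y)]) for f in function_class]
    return function_class[int(np.argmin(risks))]

# Toy setup: threshold classifiers on the line, squared loss.
thresholds = np.linspace(-1.0, 1.0, 21)
function_class = [lambda x, c=c: float(x >= c) for c in thresholds]
sq_loss = lambda p, t: (p - t) ** 2

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=200)
y = (X >= 0.25).astype(float)  # target: a threshold at 0.25, off the grid

f_hat = empirical_minimizer(function_class, sq_loss, X, y)
```

Since the true threshold 0.25 lies between grid points, the empirical minimizer's risk is small but typically nonzero; the paper's bounds quantify how close such a minimizer's true risk comes to the best achievable in the class.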
