Geometric Parameters in Learning Theory

1. Introduction
2. Glivenko-Cantelli Classes and Learnability
  2.1. The Classical Approach
  2.2. Talagrand's Inequality for Empirical Processes
3. Uniform Measures of Complexity
  3.1. Metric Entropy and the Combinatorial Dimension
  3.2. Random Averages and the Combinatorial Dimension
  3.3. Phase Transitions in GC Classes
  3.4. Concentration of the Combinatorial Dimension
4. Learning Sample Complexity and Error Bounds
  4.1. Error Bounds
  4.2. Comparing Structures
5. Estimating the Localized Averages
  5.1. L_2 Localized Averages
  5.2. Data Dependent Bounds
  5.3. Geometric Interpretation
6. Bernstein Type of L_p Loss Classes
7. Classes of Linear Functionals
8. Concluding Remarks
References
