On generalization bounds, projection profile, and margin distribution

We study the generalization properties of linear learning algorithms and develop a data-dependent approach for deriving generalization bounds that depend on the margin distribution. Our method uses random projection techniques so that existing VC-dimension bounds can be applied in the effective, lower dimension of the data. Our bounds are tighter than existing bounds and, in some cases, give informative generalization bounds for real-world, high-dimensional problems.
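The following is a rough illustration of the ingredients named above, not the procedure used in the paper. It assumes unit-norm data and a unit-norm linear classifier `w`, draws a random Gaussian projection matrix (one common choice; the abstract does not say which projection is used), and measures how often the projected classifier disagrees with the original one. All function names, the synthetic data, and the target dimensions `k` are hypothetical.

```python
import math
import random


def normalize(v):
    """Scale a vector to unit Euclidean norm."""
    n = math.sqrt(sum(a * a for a in v))
    return [a / n for a in v]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def random_projection(k, n, rng):
    """k x n matrix with i.i.d. N(0, 1/k) entries (an assumed choice)."""
    s = 1.0 / math.sqrt(k)
    return [[rng.gauss(0.0, 1.0) * s for _ in range(n)] for _ in range(k)]


def project(R, v):
    return [dot(row, v) for row in R]


def margin_distribution(w, xs):
    """Sorted margins |w . x| of unit-norm points xs w.r.t. unit-norm w."""
    return sorted(abs(dot(w, x)) for x in xs)


def sign_flip_rate(w, xs, k, rng):
    """Fraction of points whose predicted label changes after projecting
    both the classifier and the data down to k dimensions."""
    R = random_projection(k, len(w), rng)
    w_p = project(R, w)
    flips = sum(1 for x in xs if dot(w, x) * dot(w_p, project(R, x)) < 0)
    return flips / len(xs)


# Synthetic data, for illustration only.
rng = random.Random(0)
n = 200
w = normalize([rng.gauss(0, 1) for _ in range(n)])
xs = [normalize([rng.gauss(0, 1) for _ in range(n)]) for _ in range(100)]

margins = margin_distribution(w, xs)
rates = {k: sign_flip_rate(w, xs, k, rng) for k in (5, 50)}
print("smallest margins:", margins[:3])
print("sign-flip rate by target dimension:", rates)
```

Points with a large margin are unlikely to change sides under a random projection, which is why a bound stated in the projected space can depend on the whole margin distribution and not only on the minimum margin.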
