Margin-Sparsity Trade-Off for the Set Covering Machine

We propose a new learning algorithm for the set covering machine, together with a tight data-compression risk bound that the learner can use to choose the appropriate trade-off between the sparsity of a classifier and the magnitude of its separating margin.
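The set covering machine greedily builds a small conjunction (or disjunction) of data-dependent features, such as balls centered on training examples, so that sparsity is controlled directly by how many features the greedy loop adds. The sketch below illustrates that mechanism only; the function names, the penalty parameter `p`, and the radius shrink `eps` are illustrative assumptions, not the algorithm proposed in this paper.

```python
import numpy as np

def scm_fit(X, y, p=1.0, max_features=10, eps=1e-6):
    """Greedy set-covering-style learner (sketch): build a conjunction of
    balls, each centered on a positive example with a radius reaching just
    short of some negative example. `p` penalizes positives a ball rejects."""
    pos, neg = X[y == 1], X[y == 0]
    # Candidate balls: every (positive center, negative border point) pair.
    candidates = [(c, np.linalg.norm(c - n) - eps) for c in pos for n in neg]
    features, remaining = [], neg
    for _ in range(max_features):
        best, best_score = None, -np.inf
        for c, r in candidates:
            # Negatives correctly excluded vs. positives wrongly excluded.
            rejected = np.sum(np.linalg.norm(remaining - c, axis=1) > r)
            errors = np.sum(np.linalg.norm(pos - c, axis=1) > r)
            score = rejected - p * errors
            if score > best_score:
                best, best_score = (c, r), score
        features.append(best)
        c, r = best
        # Keep only the negatives this ball failed to exclude.
        remaining = remaining[np.linalg.norm(remaining - c, axis=1) <= r]
        if len(remaining) == 0:
            break
    return features

def scm_predict(features, X):
    # Conjunction: predict positive iff x lies inside every selected ball.
    out = np.ones(len(X), dtype=bool)
    for c, r in features:
        out &= np.linalg.norm(X - c, axis=1) <= r
    return out.astype(int)
```

Stopping the greedy loop early yields a sparser classifier at the cost of uncovered negatives, while shrinking the radii further enlarges the margin around the excluded negatives; the risk bound proposed here is meant to guide exactly this kind of trade-off.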
