Access to Unlabeled Data can Speed up Prediction Time

Semi-supervised learning (SSL) addresses the problem of training a classifier using a small number of labeled examples and many unlabeled examples. Most previous work on SSL focused on how the availability of unlabeled data can improve the accuracy of the learned classifiers. In this work we study how unlabeled data can be beneficial for constructing faster classifiers. We propose an SSL algorithmic framework that uses unlabeled examples to learn classifiers from a predefined set of fast classifiers. We formally analyze conditions under which our algorithmic paradigm obtains significant improvements from the use of unlabeled data. As a side benefit of our analysis, we propose a novel quantitative measure of the so-called cluster assumption. We demonstrate the potential merits of our approach by conducting experiments on the MNIST data set, showing that, when a sufficiently large unlabeled sample is available, a fast classifier can be learned from far fewer labeled examples than without such a sample.
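The abstract states the paradigm only at a high level. A minimal sketch of one plausible instantiation of the two-stage idea follows: train an accurate but slow-to-evaluate classifier on the small labeled sample, use it to pseudo-label the large unlabeled pool, and then fit a fast classifier from a predefined class on the pseudo-labeled data. The specific learners (a k-NN teacher, a linear student), the scikit-learn API, and the synthetic data stand-in are assumptions for illustration, not the paper's prescribed implementation.

```python
# Hypothetical sketch of the two-stage SSL paradigm: slow teacher on the
# labeled sample, fast student on the pseudo-labeled unlabeled pool.
from sklearn.datasets import make_classification
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for the data source: a small labeled sample plus a
# large unlabeled pool drawn from the same distribution.
X, y = make_classification(n_samples=10_500, n_features=20, random_state=0)
X_lab, y_lab = X[:500], y[:500]   # few labeled examples
X_unlab = X[500:]                 # many unlabeled examples

# Stage 1: fit an accurate but slow-to-predict classifier (k-NN prediction
# cost grows with the size of the stored sample) on the labeled data.
slow = KNeighborsClassifier(n_neighbors=5).fit(X_lab, y_lab)

# Stage 2: pseudo-label the unlabeled pool with the slow classifier, then
# train a fast classifier from a predefined class (here a linear model
# with constant-time prediction) on the pseudo-labeled data.
pseudo_y = slow.predict(X_unlab)
fast = LogisticRegression(max_iter=1000).fit(X_unlab, pseudo_y)

# `fast` is the deployed predictor: one dot product per query instead of
# a nearest-neighbor search over the training sample.
```

Under a cluster-type assumption, the pseudo-labels are accurate enough that the fast classifier approaches the slow one's accuracy while predicting much faster, which is the trade-off the paper analyzes formally.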
