Scaling Up Semi-supervised Learning: An Efficient and Effective LLGC Variant

Domains such as text classification can easily supply large amounts of unlabeled data, but labeling is expensive. Semi-supervised learning tries to exploit this abundance of unlabeled training data to improve classification. Unfortunately, most of the theoretically well-founded algorithms described in recent years are cubic or worse in the total number of labeled and unlabeled training examples. In this paper we modify the standard LLGC algorithm to improve its efficiency to the point where it can handle datasets with hundreds of thousands of training examples. The modifications are priming of the unlabeled data and, most importantly, sparsification of the similarity matrix. We report promising results on large text classification problems.
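To make the sparsification idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes the usual iterative form of LLGC, F <- alpha * S * F + (1 - alpha) * Y with S the symmetrically normalized similarity matrix, and it assumes that sparsification means keeping only each example's k nearest neighbours. The function names `knn_graph` and `llgc`, the RBF similarity with bandwidth 1, and the parameters `k`, `alpha`, and `iters` are illustrative choices. Priming is not shown, and the brute-force neighbour search is quadratic and is used here only for brevity.

```python
import math


def knn_graph(points, k):
    """Build a symmetric k-nearest-neighbour graph as {i: {j: weight}}.

    Only about k entries per row are stored instead of a dense n x n matrix.
    The RBF similarity with bandwidth 1 is an assumption for illustration.
    """
    n = len(points)
    graph = {i: {} for i in range(n)}
    for i in range(n):
        # Brute-force search over all other points (illustration only).
        dists = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )
        for d, j in dists[:k]:
            w = math.exp(-d * d)
            graph[i][j] = w
            graph[j][i] = w  # keep the graph symmetric
    return graph


def llgc(graph, labels, n_classes, alpha=0.9, iters=50):
    """Iterate F <- alpha * S * F + (1 - alpha) * Y on a sparse graph.

    `labels` maps the index of each labeled example to its class;
    all other examples are treated as unlabeled.
    """
    n = len(graph)
    deg = [sum(graph[i].values()) for i in range(n)]
    Y = [[0.0] * n_classes for _ in range(n)]
    for i, c in labels.items():
        Y[i][c] = 1.0
    F = [row[:] for row in Y]
    for _ in range(iters):
        new_F = []
        for i in range(n):
            row = [(1 - alpha) * y for y in Y[i]]
            # Each update touches only the stored neighbours of i.
            for j, w in graph[i].items():
                s = w / math.sqrt(deg[i] * deg[j])  # entry of D^-1/2 W D^-1/2
                for c in range(n_classes):
                    row[c] += alpha * s * F[j][c]
            new_F.append(row)
        F = new_F
    # Predict the class with the largest score for every example.
    return [max(range(n_classes), key=lambda c: F[i][c]) for i in range(n)]


# Two small clusters with one labeled example each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
preds = llgc(knn_graph(points, k=2), labels={0: 0, 3: 1}, n_classes=2)
```

With a sparse graph, each iteration costs time proportional to the number of stored edges times the number of classes, rather than requiring the inversion of a dense n x n matrix.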
