Online and co-regularized algorithms for large scale learning

In this work I address the issue of large-scale learning in an online setting. To tackle it, I introduce a novel algorithm that enables semi-supervised learning in an online fashion. By combining state-of-the-art online methods such as Pegasos [3] with the multi-view co-regularization framework, I achieve significantly better performance on regression and binary classification tasks. This shows that incorporating unlabeled data remains practical even in large-scale, online settings. Evaluation is done on several publicly available datasets from the UCI and LibSVM repositories. To evaluate results in a practical setting, I also consider a difficult

[1] Jing Peng, et al. SVM vs regularized least squares classification, 2004, Proceedings of the 17th International Conference on Pattern Recognition (ICPR 2004).

[2] Chia-Hua Ho, et al. Recent Advances of Large-Scale Linear Classification, 2012, Proceedings of the IEEE.

[3] Johan A. K. Suykens, et al. Advances in learning theory: methods, models and applications, 2003.

[4] Thomas Gärtner, et al. Efficient co-regularised least squares regression, 2006, ICML.

[5] Vladimir Vapnik, et al. Statistical learning theory, 1998.

[6] Nathan Srebro, et al. SVM optimization: inverse dependence on training set size, 2008, ICML '08.

[7] Mehryar Mohri, et al. AUC Optimization vs. Error Rate Minimization, 2003, NIPS.

[8] Daniel Dominic Sleator, et al. Parsing English with a Link Grammar, 1995, IWPT.

[9] Tapio Salakoski, et al. Evaluation of two dependency parsers on biomedical corpus targeted at protein-protein interactions, 2006, Int. J. Medical Informatics.

[10] Avrim Blum, et al. The Bottleneck, 2021, Monopsony Capitalism.

[11] Nathan Srebro, et al. Beating SGD: Learning SVMs in Sublinear Time, 2011, NIPS.

[12] Léon Bottou, et al. The Tradeoffs of Large Scale Learning, 2007, NIPS.

[13] Mikhail Belkin, et al. A Co-Regularization Approach to Semi-supervised Learning with Multiple Views, 2005.

[14] Yoram Singer, et al. Pegasos: primal estimated sub-gradient solver for SVM, 2011, Math. Program.

[15] D. Sculley, et al. Large Scale Learning to Rank, 2009.

[16] Jari Björne, et al. BioInfer: a corpus for information extraction in the biomedical domain, 2007, BMC Bioinformatics.

[17] Janez Demsar, et al. Statistical Comparisons of Classifiers over Multiple Data Sets, 2006, J. Mach. Learn. Res.