Online Coregularization for Multiview Semisupervised Learning

We propose a novel online coregularization framework for multiview semisupervised learning based on the notion of duality in constrained optimization. Using the weak duality theorem, we reduce online coregularization to the task of increasing the dual function. We show that the existing online coregularization algorithms from previous work can be viewed as approximations of our dual ascent process using gradient ascent. New algorithms are derived by ascending the dual function more aggressively. For practical purposes, we also propose two sparse approximation approaches for the kernel representation to reduce computational complexity. Experiments show that the derived online coregularization algorithms achieve risk and accuracy comparable to offline algorithms while consuming less time and memory. In particular, our online coregularization algorithms are able to deal with concept drift and maintain a much smaller error rate. This paper paves the way for the design and analysis of online coregularization algorithms.
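
The dual-ascent idea can be illustrated on a much simpler problem than the one treated in the paper. The sketch below is an assumption-laden toy, not the authors' algorithm: it uses a single-view, bias-free linear SVM with hinge loss instead of the coregularized multiview objective, and the names `primal`, `dual`, and `online_dual_ascent` as well as the parameters `C` and `eta` are hypothetical. Each round introduces one new dual variable and raises the dual function either by a clipped gradient step or by exact maximization over that variable (the "more aggressive" variant). By weak duality, the dual value never exceeds the primal objective.

```python
import numpy as np


def primal(w, X, y, C):
    """Primal objective: 0.5*||w||^2 + C * sum of hinge losses."""
    margins = y * (X @ w)
    return 0.5 * (w @ w) + C * np.maximum(0.0, 1.0 - margins).sum()


def dual(alpha, X, y):
    """Dual objective: sum(alpha) - 0.5*||sum_t alpha_t y_t x_t||^2."""
    w = (alpha * y) @ X
    return alpha.sum() - 0.5 * (w @ w)


def online_dual_ascent(X, y, C=1.0, eta=0.5, aggressive=False):
    """Process examples one at a time, setting one dual variable per round.

    aggressive=False: clipped gradient step on the new dual variable.
    aggressive=True:  exact maximization over the new dual variable.
    Assumes every row of X has nonzero norm.
    """
    n, d = X.shape
    alpha = np.zeros(n)
    w = np.zeros(d)
    history = []
    for t in range(n):
        # Partial derivative of the dual w.r.t. alpha_t, evaluated at alpha_t = 0.
        g = 1.0 - y[t] * (w @ X[t])
        step = g / (X[t] @ X[t]) if aggressive else eta * g
        alpha[t] = np.clip(step, 0.0, C)  # keep the box constraint 0 <= alpha_t <= C
        w += alpha[t] * y[t] * X[t]       # w always equals sum_s alpha_s y_s x_s
        history.append(dual(alpha, X, y))
    return w, alpha, history


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(50, 5))
    X /= np.linalg.norm(X, axis=1, keepdims=True)  # unit-norm rows
    y = np.where(X @ rng.normal(size=5) >= 0, 1.0, -1.0)
    for mode in (False, True):
        w, alpha, hist = online_dual_ascent(X, y, aggressive=mode)
        print("aggressive" if mode else "gradient", hist[-1], primal(w, X, y, 1.0))
```

With unit-norm inputs and `eta = 0.5`, both variants leave the dual value non-decreasing from round to round. Starting from the same state, the exact step gains at least as much in a single round as the gradient step, since it maximizes the dual over the same feasible interval.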
