Multiview Semi-Supervised Learning with Consensus

Obtaining high-quality, up-to-date labeled data is difficult in many real-world machine learning applications. Semi-supervised learning aims to improve the performance of a classifier trained on a limited number of labeled samples by also exploiting unlabeled ones. This paper improves the transductive SVM, an existing semi-supervised learning algorithm, by employing a multiview learning paradigm. Multiview learning rests on the observation that, for some problems, each data sample admits multiple representations, so-called views. In text classification, for example, one view may contain a large number of raw content features such as term frequencies, while a second view may contain a small number of highly informative, domain-specific features. We propose a novel two-view transductive SVM that exploits both the abundance of unlabeled data and its multiple representations to improve classification performance. The idea is straightforward: train a classifier on each of the two views of both the labeled and the unlabeled data, and impose a global consensus constraint requiring the two classifiers to assign the same class label to every sample. We also incorporate manifold regularization, a graph-based semi-supervised learning method, into the framework. The proposed two-view transductive SVM was evaluated on both synthetic and real-life data sets. Experimental results show that it performs up to 10 percent better than a single-view learning approach, especially when the amount of labeled data is small. A further advantage of the two-view approach is its significantly improved stability, which is especially useful when dealing with noisy data in real-world applications.
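The consensus idea described above can be sketched in a few lines. This is a minimal illustrative toy, not the paper's actual TSVM optimization: it trains one linear scorer per view with a hinge loss on the labeled samples and adds a squared-difference penalty that pushes the two views' scores together on all samples (labeled and unlabeled). All function names and hyperparameters here are our own assumptions for illustration.

```python
import numpy as np

def two_view_consensus_train(X1, X2, y, labeled, lr=0.01, reg=0.01,
                             consensus=0.1, epochs=500, seed=0):
    """Toy two-view training with a consensus penalty.

    X1, X2  : (n, d1) and (n, d2) feature matrices, one per view
    y       : length-n array of +1/-1 labels (used only at `labeled` rows)
    labeled : indices of the labeled samples; the rest act as unlabeled
    """
    rng = np.random.default_rng(seed)
    w1 = rng.normal(scale=0.01, size=X1.shape[1])
    w2 = rng.normal(scale=0.01, size=X2.shape[1])
    for _ in range(epochs):
        s1, s2 = X1 @ w1, X2 @ w2          # per-view scores
        g1, g2 = reg * w1, reg * w2        # L2 regularization gradients
        # Hinge-loss subgradient on the labeled samples, per view.
        for i in labeled:
            if y[i] * s1[i] < 1:
                g1 -= y[i] * X1[i]
            if y[i] * s2[i] < 1:
                g2 -= y[i] * X2[i]
        # Consensus penalty 0.5*c*sum((s1 - s2)^2) over every sample:
        # its gradient pulls the two views toward agreement.
        d = s1 - s2
        g1 += consensus * (X1.T @ d)
        g2 -= consensus * (X2.T @ d)
        w1 -= lr * g1
        w2 -= lr * g2
    return w1, w2
```

A joint prediction can then be made by averaging the two views' scores, `np.sign(X1 @ w1 + X2 @ w2)`. Because noise features are typically uncorrelated across views, the agreement term tends to suppress their weights, which is one intuition for why the consensus constraint helps when labels are scarce.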
