Two-View Transductive Support Vector Machines

Obtaining high-quality, up-to-date labeled data is difficult in many real-world machine learning applications, especially for Internet classification tasks such as review spam detection, where the data change at a brisk pace. For some problems, each data sample admits multiple perspectives, so-called views. In text classification, for example, the typical view contains a large number of raw content features such as term frequencies, while a second view may contain a small number of highly informative, domain-specific features. We therefore propose a novel two-view transductive SVM that exploits both the abundant unlabeled data and the multiple representations of each sample to improve classifier performance. The idea is simple: train a classifier on each of the two views of both the labeled and the unlabeled data, and impose a global constraint that the two classifiers assign the same class label to every labeled and unlabeled sample. We applied our two-view transductive SVM to the WebKB course dataset and a real-life review spam classification dataset. Experimental results show that the proposed approach performs up to 5% better than a single-view learning algorithm, especially when the amount of labeled data is small. A further advantage of the two-view approach is its significantly improved stability, which is especially useful for noisy real-world data.
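The core idea can be stated as a single coupled optimization problem. Below is a minimal sketch of one such formulation, written for linear classifiers with hinge losses; the notation ($w_v$ for the view-$v$ weight vector, $x_i^{(v)}$ for the view-$v$ representation of sample $i$, $l$ labeled and $u$ unlabeled points, trade-off parameters $C$ and $C^*$) is assumed here for illustration and is not necessarily the paper's exact objective:

$$
\min_{\substack{w_1,\, w_2,\\ y_{l+1},\dots,y_{l+u}\in\{\pm1\}}}
\;\sum_{v=1}^{2}\left[
\frac{1}{2}\lVert w_v\rVert^2
+ C\sum_{i=1}^{l}\max\!\bigl(0,\,1-y_i\,w_v^{\top}x_i^{(v)}\bigr)
+ C^{*}\sum_{j=l+1}^{l+u}\max\!\bigl(0,\,1-y_j\,w_v^{\top}x_j^{(v)}\bigr)
\right]
$$

Because the label variables $y_j$ of the unlabeled points are shared by both view-specific terms, minimizing this objective forces the two classifiers to commit to one common labeling of the unlabeled data, which is exactly the global agreement constraint described above.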
