Two-View Transductive Support Vector Machines

Obtaining high-quality, up-to-date labeled data is difficult in many real-world machine learning applications, especially for Internet classification tasks such as review spam detection, where the data change at a brisk pace. For some problems, each data sample admits multiple perspectives, so-called views. In text classification, for example, the typical view contains a large number of raw content features such as term frequencies, while a second view may contain a small number of highly informative, domain-specific features. We therefore propose a novel two-view transductive SVM that exploits both the abundant unlabeled data and the multiple representations of each sample to improve classifier performance. The idea is simple: train a classifier on each of the two views of both the labeled and the unlabeled data, and impose a global constraint that the two classifiers assign the same class label to every labeled and unlabeled sample. We applied our two-view transductive SVM to the WebKB course dataset and a real-life review spam classification dataset. Experimental results show that the proposed approach performs up to 5% better than a single-view learning algorithm, especially when the amount of labeled data is small. A further advantage of the two-view approach is its significantly improved stability, which is especially useful for noisy real-world data.
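The core idea can be stated as a single coupled optimization problem. Below is a minimal sketch of one such formulation, written for linear classifiers with hinge losses; the notation ($w_v$ for the view-$v$ weight vector, $x_i^{(v)}$ for the view-$v$ representation of sample $i$, $l$ labeled and $u$ unlabeled points, trade-off parameters $C$ and $C^*$) is assumed here for illustration and is not necessarily the paper's exact objective:

$$
\min_{\substack{w_1,\, w_2,\\ y_{l+1},\dots,y_{l+u}\in\{\pm1\}}}
\;\sum_{v=1}^{2}\left[
\frac{1}{2}\lVert w_v\rVert^2
+ C\sum_{i=1}^{l}\max\!\bigl(0,\,1-y_i\,w_v^{\top}x_i^{(v)}\bigr)
+ C^{*}\sum_{j=l+1}^{l+u}\max\!\bigl(0,\,1-y_j\,w_v^{\top}x_j^{(v)}\bigr)
\right]
$$

Because the label variables $y_j$ of the unlabeled points are shared by both view-specific terms, minimizing this objective forces the two classifiers to commit to one common labeling of the unlabeled data, which is exactly the global agreement constraint described above.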
