Semi-Supervised Text Classification With Universum Learning

Universum, a collection of nonexamples that do not belong to any class of interest, has become a new research topic in machine learning. This paper devises a boosting-based semi-supervised learning algorithm with Universum and focuses on situations where only a few labeled examples are available. We also show that the training error of AdaBoost with Universum is bounded by the product of the normalization factors, and that the training error drops exponentially fast when each weak classifier is slightly better than random guessing. The experiments use four data sets in several combinations. The results indicate that the proposed algorithm can benefit from Universum examples and outperform several alternative methods, particularly when labeled examples are scarce. These results can be explained using the concept of Universum introduced by Vapnik: Universum examples implicitly specify a prior distribution on the set of classification functions. When the number of labeled examples is insufficient to estimate the parameters of the classification functions, the Universum can be used to approximate that prior distribution.
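The training-error claim can be illustrated with the classical AdaBoost bound. The sketch below is not the Universum variant analyzed in the paper, whose normalization factor is not given in the abstract. It assumes the standard binary setting, in which round t has weighted error e_t and normalization factor Z_t = 2*sqrt(e_t*(1 - e_t)), so the training error is at most the product of the Z_t, which is in turn at most exp(-2 * sum (1/2 - e_t)^2). The function names are hypothetical.

```python
import math


def normalization_product_bound(weak_errors):
    """Product of Z_t = 2*sqrt(e_t*(1 - e_t)) over all boosting rounds.

    Assumption: classical binary AdaBoost, where this product upper-bounds
    the training error of the combined classifier.
    """
    bound = 1.0
    for e in weak_errors:
        bound *= 2.0 * math.sqrt(e * (1.0 - e))
    return bound


def exponential_bound(weak_errors):
    """Looser closed form: exp(-2 * sum_t (1/2 - e_t)^2)."""
    return math.exp(-2.0 * sum((0.5 - e) ** 2 for e in weak_errors))


# Weak learners that are only slightly better than chance (error 0.4)
# still drive the bound down geometrically in the number of rounds.
for rounds in (10, 50, 200):
    errors = [0.4] * rounds
    print(rounds, normalization_product_bound(errors), exponential_bound(errors))
```

With every weak error fixed at 0.5 (pure guessing) the product stays at 1 and the bound is vacuous; any constant edge below 0.5 makes it shrink exponentially with the number of rounds.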
