Semi-supervised multi-class Adaboost by exploiting unlabeled data

Research highlights:
- We propose a semi-supervised learning method based on multi-class boosting.
- It handles K-class classification directly, without reducing it to multiple two-class problems.
- Each base classifier needs classification accuracy only better than 1/K.
- Higher classification accuracy is achieved by exploiting the unlabeled data.

Semi-supervised learning has attracted much attention in pattern recognition and machine learning. Most semi-supervised learning algorithms are designed for binary classification and are then extended to multi-class cases through approaches such as one-against-the-rest. In this work, we propose a semi-supervised learning method based on multi-class boosting, which classifies multi-class data directly and achieves high classification accuracy by exploiting the unlabeled data. Our approach has two distinct features: (1) it handles multi-class cases directly, without reducing them to multiple two-class problems, and (2) each base classifier needs classification accuracy only better than 1/K, where K is the number of classes (i.e., better than random guessing). Experiments on 21 UCI benchmark data sets show that the proposed method is effective.
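The 1/K condition comes from the multi-class AdaBoost (SAMME) update of Hastie et al., on which this work builds. Below is a minimal sketch of that supervised boosting core only, not the paper's full semi-supervised algorithm; the function name samme_fit, the choice of decision stumps as base classifiers, and the parameter n_rounds are illustrative assumptions:

# Minimal sketch of the SAMME multi-class AdaBoost update (Hastie et al.).
# This is the supervised core the paper builds on, NOT its semi-supervised
# extension; names and defaults here are illustrative assumptions.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def samme_fit(X, y, K, n_rounds=50):
    n = len(y)
    w = np.full(n, 1.0 / n)                    # uniform example weights
    learners, alphas = [], []
    for m in range(n_rounds):
        stump = DecisionTreeClassifier(max_depth=1).fit(X, y, sample_weight=w)
        pred = stump.predict(X)
        err = np.sum(w * (pred != y))          # weighted training error
        # alpha > 0 as long as err < (K-1)/K, i.e. accuracy exceeds 1/K;
        # this is why each base classifier only needs to beat random guessing.
        if err >= (K - 1) / K:
            break
        alpha = np.log((1 - err) / max(err, 1e-10)) + np.log(K - 1)
        w *= np.exp(alpha * (pred != y))       # up-weight misclassified points
        w /= w.sum()
        learners.append(stump)
        alphas.append(alpha)
    return learners, alphas

The key point is the log(K - 1) term: the classifier weight alpha stays positive whenever the weighted error is below (K-1)/K, so each base classifier only has to exceed the 1/K accuracy of random guessing among K classes, rather than the 1/2 required by binary AdaBoost.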
