A Novel Semi-Supervised SVM Based on Tri-Training

Two central difficulties in machine learning are solving large-scale problems efficiently and coping with labeled data that are scarce and expensive to obtain. This paper proposes a new semi-supervised SVM algorithm that applies tri-training to improve the SVM. The algorithm uses a large amount of unlabeled data to refine the classifiers iteratively. Although tri-training places no constraints on the base classifier, the proposed method uses three different SVMs as the base classifiers. Experiments on UCI datasets show that tri-training improves the classification accuracy of SVM, and that increasing the diversity among the classifiers raises the accuracy of the final classifier. Theoretical analysis and experiments indicate that the proposed method achieves high accuracy and fast classification.
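The iterative scheme described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a small linear SVM trained by subgradient descent as a stand-in for a full SVM solver, makes the three SVMs "different" by giving them different regularization strengths (an assumption, since the abstract does not say how they differ), and omits the error-rate checks that the original tri-training algorithm applies before accepting pseudo-labels. The names LinearSVM, tri_train, vote and make_toy_data are hypothetical.

```python
import random


class LinearSVM:
    """Tiny linear SVM: batch subgradient descent on the L2-regularized hinge loss."""

    def __init__(self, lam=0.01, epochs=100, lr=0.1):
        self.lam, self.epochs, self.lr = lam, epochs, lr
        self.w, self.b = [], 0.0

    def _score(self, x):
        return sum(wj * xj for wj, xj in zip(self.w, x)) + self.b

    def fit(self, X, y):
        n, d = len(X), len(X[0])
        self.w, self.b = [0.0] * d, 0.0
        for _ in range(self.epochs):
            gw, gb = [self.lam * wj for wj in self.w], 0.0
            for xi, yi in zip(X, y):
                if yi * self._score(xi) < 1:  # margin violated: hinge term is active
                    for j in range(d):
                        gw[j] -= yi * xi[j] / n
                    gb -= yi / n
            self.w = [wj - self.lr * gj for wj, gj in zip(self.w, gw)]
            self.b -= self.lr * gb
        return self

    def predict(self, X):
        return [1 if self._score(x) >= 0 else -1 for x in X]


def tri_train(X_lab, y_lab, X_unl, lams=(0.001, 0.01, 0.1), rounds=5, seed=0):
    """Simplified tri-training with three SVMs (labels must be +1 / -1)."""
    rng = random.Random(seed)
    n = len(X_lab)
    # Initial classifiers: one bootstrap sample of the labeled set per SVM.
    models = []
    for lam in lams:
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(LinearSVM(lam=lam).fit([X_lab[i] for i in idx],
                                             [y_lab[i] for i in idx]))
    prev = None
    for _ in range(rounds):
        preds = [m.predict(X_unl) for m in models]
        if preds == prev:  # pseudo-labels stopped changing
            break
        prev = preds
        new_models = []
        for i, lam in enumerate(lams):
            j, k = [t for t in range(3) if t != i]
            # Unlabeled points on which the other two classifiers agree
            # become pseudo-labeled training data for classifier i.
            agree = [u for u in range(len(X_unl)) if preds[j][u] == preds[k][u]]
            X_new = X_lab + [X_unl[u] for u in agree]
            y_new = y_lab + [preds[j][u] for u in agree]
            new_models.append(LinearSVM(lam=lam).fit(X_new, y_new))
        models = new_models
    return models


def vote(models, X):
    """Final classifier: majority vote of the three SVMs."""
    preds = [m.predict(X) for m in models]
    return [1 if sum(p) > 0 else -1 for p in zip(*preds)]


def make_toy_data(n, rng):
    """Two Gaussian blobs at (+2, +2) and (-2, -2); synthetic, for illustration only."""
    X, y = [], []
    for i in range(n):
        label = 1 if i % 2 == 0 else -1
        X.append([rng.gauss(2 * label, 0.5), rng.gauss(2 * label, 0.5)])
        y.append(label)
    return X, y
```

A typical call would be models = tri_train(X_lab, y_lab, X_unl) followed by vote(models, X_test). Retraining each classifier on the full labeled set plus the points the other two agree on is what lets the unlabeled data modify the classifiers from round to round.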
