A Novel Semi-Supervised SVM Based on Tri-Training

Two central difficulties in machine learning are solving large-scale problems efficiently and coping with labeled data that are scarce and expensive to obtain. This paper proposes a new semi-supervised SVM algorithm that applies tri-training to improve SVM. The semi-supervised SVM uses a large amount of unlabeled data to refine the classifiers iteratively. Although tri-training places no constraints on the base classifier, the proposed method uses three different SVMs as the classification algorithm. Experiments on UCI datasets show that tri-training improves the classification accuracy of SVM, and that increasing the difference among the classifiers raises the accuracy of the final classifier. Theoretical analysis and experiments show that the proposed method achieves high accuracy and classification speed.
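The iterative refinement described above can be illustrated with a minimal sketch of a tri-training loop. This is not the paper's implementation: it is a simplified variant that omits the error-rate conditions of the full tri-training algorithm, and the names `tri_train` and `majority_vote` are hypothetical. The base learners are assumed to be any three objects exposing `fit(X, y)` and `predict(X)`; in the setting of this paper they would be three differently configured SVMs.

```python
import random
from collections import Counter


def tri_train(learners, X_lab, y_lab, X_unlab, max_rounds=10, seed=0):
    """Simplified tri-training (assumption: no error-rate checks).

    learners: list of exactly three objects with fit(X, y) and predict(X).
    """
    rng = random.Random(seed)
    n = len(X_lab)
    # Start each learner on its own bootstrap sample of the labeled data.
    for clf in learners:
        idx = [rng.randrange(n) for _ in range(n)]
        clf.fit([X_lab[i] for i in idx], [y_lab[i] for i in idx])

    previous = None
    for _ in range(max_rounds):
        preds = [list(clf.predict(X_unlab)) for clf in learners]
        pseudo = []
        for i in range(3):
            j, k = [m for m in range(3) if m != i]
            # Learner i receives the unlabeled points on which the other
            # two learners agree, labeled with their shared prediction.
            pseudo.append({u: preds[j][u] for u in range(len(X_unlab))
                           if preds[j][u] == preds[k][u]})
        if pseudo == previous:
            break  # pseudo-labels stopped changing, so retraining would too
        previous = pseudo
        # Retrain every learner on all labeled data plus its pseudo-labels.
        for clf, extra in zip(learners, pseudo):
            X_new = list(X_lab) + [X_unlab[u] for u in extra]
            y_new = list(y_lab) + list(extra.values())
            clf.fit(X_new, y_new)
    return learners


def majority_vote(learners, X):
    """Final classifier: the label predicted by most of the learners."""
    votes = zip(*(list(clf.predict(X)) for clf in learners))
    return [Counter(v).most_common(1)[0][0] for v in votes]
```

With scikit-learn installed, one way to obtain three different SVMs would be `[SVC(kernel="linear"), SVC(kernel="rbf"), SVC(kernel="poly")]`. The choice of kernels here is an assumption for illustration, since the abstract does not state how the three SVMs differ. On very small labeled sets a bootstrap sample may contain a single class, which `SVC.fit` rejects, so a practical version would need to resample in that case.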
