Self-Trained LMT for Semisupervised Learning

The defining strength of semisupervised classification methods is their use of abundant unlabeled data together with a much smaller set of labeled examples, with the goal of increasing classification accuracy over supervised methods, which rely on labeled data alone during the training phase. The lack of automated mechanisms for producing labeled data, together with the high cost of the human effort required for manual labeling in many scientific domains, raises the need for semisupervised methods that compensate for this scarcity. In this work, a self-trained Logistic Model Trees (LMT) algorithm is presented, which combines tree induction with logistic regression under the scenario of scarce labeled data. An in-depth comparison with other well-known semisupervised classification methods on standard benchmark datasets shows that the presented technique achieves better accuracy in most cases.
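To make the self-training procedure concrete, the sketch below implements a generic self-training loop in Python: fit a base classifier on the labeled pool, predict the unlabeled pool, promote high-confidence predictions into the labeled pool, and repeat. This is an illustrative sketch, not the paper's implementation: LMT is a Weka classifier with no scikit-learn counterpart, so a decision tree stands in as the base learner, and the confidence threshold, iteration limit, and function names are assumptions.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def self_train(base_clf, X_lab, y_lab, X_unlab, threshold=0.9, max_iter=10):
    """Generic self-training loop (hypothetical helper, not the paper's code).

    Repeatedly fits base_clf on the labeled pool, then moves unlabeled
    examples whose predicted class probability reaches `threshold` into
    the labeled pool with their predicted labels.
    """
    X_lab, y_lab = X_lab.copy(), y_lab.copy()
    X_unlab = X_unlab.copy()
    for _ in range(max_iter):
        if len(X_unlab) == 0:
            break
        base_clf.fit(X_lab, y_lab)
        proba = base_clf.predict_proba(X_unlab)
        confident = proba.max(axis=1) >= threshold
        if not confident.any():
            break  # no prediction is trusted enough; stop early
        # Promote confidently predicted examples into the labeled pool.
        pseudo_labels = base_clf.classes_[proba[confident].argmax(axis=1)]
        X_lab = np.vstack([X_lab, X_unlab[confident]])
        y_lab = np.concatenate([y_lab, pseudo_labels])
        X_unlab = X_unlab[~confident]
    base_clf.fit(X_lab, y_lab)  # final fit on the enlarged labeled set
    return base_clf

# Example usage: pretend only 10% of the data is labeled.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_lab, X_unlab, y_lab, _ = train_test_split(
    X, y, train_size=0.1, stratify=y, random_state=0)
clf = self_train(DecisionTreeClassifier(max_depth=3), X_lab, y_lab, X_unlab)
```

For reference, scikit-learn ships the same idea as sklearn.semi_supervised.SelfTrainingClassifier; the explicit loop above is shown only to expose the mechanics that the presented method applies with LMT as the base learner.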
