Combination of Active Learning and Semi-Supervised Learning under a Self-Training Scheme

One of the major factors affecting the performance of classification algorithms is the amount of labeled data available during the training phase. It is widely accepted that labeling vast amounts of data is both expensive and time-consuming, since it requires human expertise. In a wide variety of scientific fields, unlabeled examples are easy to collect but hard to exploit in a way that enriches the information contained in a given dataset. In this context, a variety of learning methods that aim to utilize vast amounts of unlabeled data efficiently during the learning process have been studied in the literature. The most common approaches tackle problems of this kind by applying active learning or semi-supervised learning methods individually. In this work, a combination of active learning and semi-supervised learning methods is proposed, under a common self-training scheme, in order to utilize the available unlabeled data efficiently. The entropy and the distribution of probabilities of the unlabeled set are used as effective and robust metrics for selecting the most suitable unlabeled examples to augment the initial labeled set. The superiority of the proposed scheme is validated by comparing it against the base approaches of supervised, semi-supervised, and active learning on a wide range of fifty-five benchmark datasets.
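The abstract does not spell out the selection rule, so the following is only a minimal sketch of one plausible reading, not the authors' algorithm. It assumes that each unlabeled example comes with a predicted class-probability vector, that the highest-entropy examples are sent to a human annotator (the active learning step), and that the lowest-entropy examples are pseudo-labeled with their most probable class (the self-training step). The function names `entropy` and `split_unlabeled` and the parameters `n_query` and `n_pseudo` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (natural log) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def split_unlabeled(
    prob_rows: Sequence[Sequence[float]],
    n_query: int,
    n_pseudo: int,
) -> Tuple[List[int], List[Tuple[int, int]]]:
    """Rank unlabeled examples by the entropy of their predicted probabilities.

    Assumed rule (not taken from the paper):
      - the n_pseudo lowest-entropy examples are pseudo-labeled with their
        most probable class (self-training);
      - the n_query highest-entropy examples among the rest are returned
        for manual labeling (active learning).
    """
    # Indices sorted from most confident (low entropy) to least confident.
    order = sorted(range(len(prob_rows)), key=lambda i: entropy(prob_rows[i]))

    pseudo_idx = order[:n_pseudo]
    taken = set(pseudo_idx)

    # Walk from the most uncertain end, skipping anything already pseudo-labeled.
    query_idx = [i for i in reversed(order) if i not in taken][:n_query]

    # Pseudo-label = index of the largest predicted probability.
    pseudo = [
        (i, max(range(len(prob_rows[i])), key=prob_rows[i].__getitem__))
        for i in pseudo_idx
    ]
    return query_idx, pseudo


if __name__ == "__main__":
    # Toy predicted probabilities for four unlabeled examples, two classes.
    probs = [[0.5, 0.5], [0.99, 0.01], [0.6, 0.4], [0.05, 0.95]]
    to_query, to_pseudo_label = split_unlabeled(probs, n_query=1, n_pseudo=2)
    print("ask the annotator about:", to_query)
    print("pseudo-label (index, class):", to_pseudo_label)
```

In a full self-training loop, both the annotated and the pseudo-labeled examples would be moved into the labeled set, the base classifier retrained, and the procedure repeated until a labeling budget or iteration limit is reached.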
