Hybrid active learning for reducing the annotation effort of operators in classification systems

Active learning denotes any form of learning in which the learning algorithm has some control over the input samples from which it builds its model, exercised through a specific sample selection process. In this paper, we propose a novel active learning strategy for data-driven classifiers, which is based on an unsupervised criterion during the off-line training phase, followed by a supervised certainty-based criterion during incremental on-line training. In this sense, we call the new strategy hybrid active learning. Sample selection in the first phase is conducted from scratch (i.e., no initial labels or learners are needed) based on purely unsupervised criteria obtained from clusters: samples lying near cluster centers and near the borders of clusters are expected to be the most informative regarding the distribution characteristics of the classes. In the second phase, the task is to update already trained classifiers in on-line mode with the most important samples in order to dynamically guide the classifier toward higher predictive power. Both strategies are essential for reducing the annotation and supervision effort of operators in off-line and on-line classification systems, as operators only have to label a carefully selected subset of the off-line training data and give feedback only on specific occasions during the on-line phase. The new active learning strategy is evaluated on real-world data sets from the UCI repository and on data collected at on-line quality control systems. The results show that an active learning based selection of training samples (1) does not weaken the classification accuracies compared to using all samples in the training process and (2) can outperform classifiers which are built on randomly selected data samples.
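The two selection phases described above can be illustrated with a minimal sketch. It is not the authors' implementation: the plain k-means routine, the border criterion (a small gap between the distances to the two nearest cluster centers), the certainty measure (the gap between the two highest class scores), and all function names and default parameters are assumptions made for illustration only.

```python
import math
import random


def dist(a, b):
    """Euclidean distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(X, k, n_iter=20, seed=0):
    """Plain Lloyd's k-means; a stand-in for any clustering method."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(X, k)]
    for _ in range(n_iter):
        groups = [[] for _ in range(k)]
        for p in X:
            j = min(range(k), key=lambda c: dist(p, centers[c]))
            groups[j].append(p)
        for j, members in enumerate(groups):
            if members:  # keep the old center if a cluster becomes empty
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return centers


def select_offline(X, k=2, n_center=2, n_border=2, seed=0):
    """Phase 1 (unsupervised): pick samples near centers and near borders.

    Assumed border criterion: small difference between the distances to
    the nearest and second-nearest cluster center (requires k >= 2).
    """
    centers = kmeans(X, k, seed=seed)
    per_cluster = [[] for _ in range(k)]
    for i, p in enumerate(X):
        d = sorted((dist(p, c), j) for j, c in enumerate(centers))
        nearest, margin = d[0][0], d[1][0] - d[0][0]
        per_cluster[d[0][1]].append((i, nearest, margin))
    chosen = set()
    for members in per_cluster:
        by_center = sorted(members, key=lambda m: m[1])[:n_center]
        by_border = sorted(members, key=lambda m: m[2])[:n_border]
        chosen.update(i for i, _, _ in by_center + by_border)
    return sorted(chosen)  # indices of samples to hand to the operator


def needs_feedback(class_scores, threshold=0.2):
    """Phase 2 (supervised): ask for a label when the classifier is unsure.

    Assumed certainty measure: gap between the two highest class scores.
    """
    top = sorted(class_scores, reverse=True)
    return (top[0] - top[1]) < threshold


# Hypothetical data: two well-separated Gaussian blobs in 2-D.
rng = random.Random(1)
X = [(rng.gauss(0, 0.5), rng.gauss(0, 0.5)) for _ in range(50)]
X += [(rng.gauss(4, 0.5), rng.gauss(4, 0.5)) for _ in range(50)]
to_label = select_offline(X)
```

In this sketch, `to_label` holds the indices an operator would annotate before the initial classifier is trained, while `needs_feedback` would be called on the class scores of each incoming on-line sample to decide whether operator feedback is requested for the incremental update.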
