On Dynamic Selection of the Most Informative Samples in Classification Problems

In this paper, we propose a dynamic technique for selecting the most informative samples in classification problems. It comes in two stages: the first stage conducts sample selection in batch off-line mode, based on unsupervised criteria extracted from cluster partitions; the second stage applies an active learning scheme during on-line adaptation of classifiers in non-stationary environments, based on the reliability of the classifiers' output responses (their confidence in their predictions). Both stages reduce the annotation effort for operators, who only have to label or give feedback on a subset of the off-line and on-line samples. At the same time, classifier accuracy stays at almost the same level as when the classifiers are trained on all samples. We verify this on real-world data sets from two image classification problems used in on-line surface inspection scenarios.
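
As an illustration of the on-line, confidence-based selection described above, the following minimal sketch asks for operator feedback only on samples where the classifier's prediction is unreliable. It is not the paper's method: the margin criterion (difference between the two highest class confidences), the function names `confidence_margin` and `select_for_labeling`, and the threshold value are assumptions chosen for illustration.

```python
from typing import List, Sequence


def confidence_margin(confidences: Sequence[float]) -> float:
    """Return the gap between the two largest class confidences.

    A small gap means the classifier is unsure between two classes.
    (Assumed reliability measure, not taken from the paper.)
    """
    if len(confidences) < 2:
        raise ValueError("need at least two class confidences")
    top, second = sorted(confidences, reverse=True)[:2]
    return top - second


def select_for_labeling(
    stream: Sequence[Sequence[float]], margin_threshold: float = 0.2
) -> List[int]:
    """Return indices of samples whose prediction margin is below the threshold.

    Only these samples would be passed to the operator for feedback;
    the rest are processed without annotation.
    """
    return [
        i
        for i, conf in enumerate(stream)
        if confidence_margin(conf) < margin_threshold
    ]


# Hypothetical classifier outputs (one confidence vector per incoming sample).
outputs = [
    [0.95, 0.05],  # confident prediction, no feedback requested
    [0.55, 0.45],  # uncertain prediction, feedback requested
    [0.30, 0.70],  # fairly confident, no feedback requested
    [0.48, 0.52],  # uncertain prediction, feedback requested
]
queried = select_for_labeling(outputs, margin_threshold=0.2)
print(queried)  # [1, 3]
```

Raising `margin_threshold` requests feedback on more samples, while lowering it reduces the annotation effort at the risk of missing informative samples.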
