Active sample selection and correction propagation on a gradually-augmented graph

When data have a complex manifold structure, or when their characteristics evolve over time, it is unrealistic to expect a graph-based semi-supervised learning method to classify every sample correctly from a small number of initial annotations. To address this issue with minimal human intervention, we propose (i) a sample selection criterion that actively queries informative samples by minimizing the expected prediction error, and (ii) an efficient correction propagation method that propagates human corrections on the selected samples to unlabeled samples over a gradually-augmented graph, without rebuilding the affinity graph. Experiments on three real-world datasets show that our active sample selection and correction propagation algorithm quickly reaches high-quality classification results with minimal human intervention.
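To make the idea of querying the sample that minimizes the expected prediction error concrete, the sketch below uses the standard Gaussian-field harmonic solution for binary labels with a naive expected-risk criterion. It is an illustration under stated assumptions, not the algorithm proposed in this paper: the function names (`harmonic_solution`, `estimated_risk`, `select_query`) and the toy chain graph are hypothetical, the harmonic system is re-solved from scratch for every candidate, and the efficient correction propagation over an augmented graph is not reproduced.

```python
import numpy as np


def harmonic_solution(W, labeled_idx, labels):
    """Solve L_uu f_u = W_ul f_l for the unlabeled nodes (binary labels in {0, 1})."""
    n = W.shape[0]
    labeled_idx = np.asarray(labeled_idx, dtype=int)
    unlabeled_idx = np.setdiff1d(np.arange(n), labeled_idx)
    laplacian = np.diag(W.sum(axis=1)) - W
    L_uu = laplacian[np.ix_(unlabeled_idx, unlabeled_idx)]
    W_ul = W[np.ix_(unlabeled_idx, labeled_idx)]
    f_u = np.linalg.solve(L_uu, W_ul @ np.asarray(labels, dtype=float))
    return unlabeled_idx, f_u


def estimated_risk(f_u):
    """Estimated error of the thresholded prediction: sum of min(f, 1 - f)."""
    return float(np.minimum(f_u, 1.0 - f_u).sum())


def select_query(W, labeled_idx, labels):
    """Return the unlabeled node whose hypothetical labeling gives the lowest expected risk."""
    unlabeled_idx, f_u = harmonic_solution(W, labeled_idx, labels)
    best_node, best_risk = None, np.inf
    for node, p1 in zip(unlabeled_idx, f_u):
        expected = 0.0
        # Weight each possible answer by the current belief that it is correct.
        for y, prob in ((0, 1.0 - p1), (1, p1)):
            _, f_new = harmonic_solution(
                W, list(labeled_idx) + [int(node)], list(labels) + [y]
            )
            expected += prob * estimated_risk(f_new)
        if expected < best_risk:
            best_node, best_risk = int(node), expected
    return best_node, best_risk


# Toy example (assumed data): a 5-node chain 0-1-2-3-4 with unit edge weights,
# node 0 labeled 0 and node 4 labeled 1.
W = np.zeros((5, 5))
for i in range(4):
    W[i, i + 1] = W[i + 1, i] = 1.0

unlabeled, f_u = harmonic_solution(W, [0, 4], [0, 1])
query_node, query_risk = select_query(W, [0, 4], [0, 1])
```

Re-solving the linear system for every candidate and every possible label is costly on large graphs; this sketch trades efficiency for readability.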
