Batch Active Learning With Two-Stage Sampling

Because it can train accurate models with significantly fewer labeled instances, active learning has been widely researched and applied. To reduce the time complexity of active learning, so that the oracle need not wait for the algorithm to supply instances for labeling, we propose a new active learning method that combines batch sampling and direct boundary annotation with a two-stage sampling strategy. In the first stage, the initial seeds, which determine the locations of boundary annotation, are selected by rejection sampling based on the clustering structure of the instances, ensuring that the seeds approximate the data distribution while remaining highly diverse. In the second stage, by treating instance sampling as the selection of a representative within a local region and maximizing the reward gained from selecting an instance as the new representative, we propose a novel mechanism that maintains the local representativeness and diversity of the query instances. Unlike conventional pool-based active learning methods, our method does not need to retrain the model in each iteration, which reduces computation and time consumption. Experimental results on three public datasets show that the proposed method performs comparably to uncertainty-based active learning methods, demonstrating that its sampling mechanism is effective even though no model is retrained between iterations and the method does not depend on the precision of the model.
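The two stages described above can be sketched as follows. This is a minimal illustration, not the paper's actual algorithm: the cluster assignments are assumed to be precomputed, the acceptance probability is taken to be proportional to cluster size, and the stage-2 reward is assumed to be average similarity to the pool (representativeness) minus maximum similarity to already-selected queries (diversity). Function names such as `select_seeds` and `select_batch` are hypothetical.

```python
import numpy as np

def select_seeds(X, cluster_ids, n_seeds, rng=None):
    """Stage 1: rejection-sample seed instances from a precomputed clustering.

    A uniformly drawn candidate is accepted with probability proportional to
    its cluster's size, so accepted seeds approximate the data distribution;
    taking at most one seed per cluster keeps them diverse.
    """
    rng = np.random.default_rng(rng)
    sizes = np.bincount(cluster_ids)
    assert n_seeds <= (sizes > 0).sum(), "need at least one cluster per seed"
    seeds, used_clusters = [], set()
    while len(seeds) < n_seeds:
        i = int(rng.integers(len(X)))
        c = int(cluster_ids[i])
        if c in used_clusters:
            continue  # one seed per cluster enforces diversity
        if rng.random() < sizes[c] / sizes.max():  # rejection step
            seeds.append(i)
            used_clusters.add(c)
    return seeds

def select_batch(X, candidates, batch_size):
    """Stage 2: greedily pick the candidate with the largest reward, where
    reward = mean similarity to all instances (local representativeness)
    minus max similarity to already-selected queries (diversity penalty)."""
    # Negative Euclidean distance serves as a simple similarity measure.
    sim = -np.linalg.norm(X[:, None] - X[None, :], axis=-1)
    selected = []
    for _ in range(batch_size):
        best, best_reward = None, -np.inf
        for i in candidates:
            if i in selected:
                continue
            repr_score = sim[i].mean()
            div_penalty = max((sim[i][j] for j in selected), default=0.0)
            reward = repr_score - div_penalty
            if reward > best_reward:
                best, best_reward = i, reward
        selected.append(best)
    return selected
```

Because neither stage queries a classifier, the sampling loop avoids the per-iteration model retraining that pool-based methods require.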
