Deep Similarity-Based Batch Mode Active Learning with Exploration-Exploitation

Active learning aims to reduce manual labeling effort by proactively selecting the most informative unlabeled instances to query. In real-world scenarios, it is often more practical to query a batch of instances rather than a single one at each iteration. Doing so requires the selected instances to be not only informative but also diverse. Many heuristic methods have been proposed for batch mode active learning, but they suffer from two limitations that, if addressed, would significantly improve the query strategy. First, the similarity among instances is calculated directly from the feature vectors rather than being learned jointly with the classification model, which weakens the accuracy of the diversity measurement. Second, these methods usually exploit the decision boundary by querying the data points close to it, which can be inefficient when the labeled set is too small to reveal the true boundary. In this paper, we address both limitations by proposing a deep neural network based algorithm. In the training phase, a pairwise deep network is trained not only to perform classification but also to project data points into another space, where similarity can be measured more precisely. In the query selection phase, the learner selects a set of instances that are maximally uncertain and minimally redundant (exploitation) and that are most diverse from the labeled instances (exploration). We evaluate the proposed method on a variety of classification tasks: MNIST classification, opinion polarity detection, and heart failure prediction. Our method outperforms the baselines, achieving both higher classification accuracy and a faster convergence rate.
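The query selection phase can be illustrated with a small greedy sketch. The abstract does not state the scoring function, so everything below is an assumption made for illustration: uncertainty is taken to be predictive entropy, similarity is cosine similarity between embedding vectors (standing in for the learned similarity space), and the additive score with the hypothetical weights `alpha` and `beta` is a placeholder rather than the paper's actual objective. The function name `select_batch` is likewise hypothetical.

```python
import math


def entropy(p):
    """Shannon entropy of a class-probability vector (assumed uncertainty measure)."""
    return -sum(q * math.log(q) for q in p if q > 0.0)


def cosine(u, v):
    """Cosine similarity between two embedding vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0


def select_batch(probs, pool_emb, labeled_emb, batch_size, alpha=1.0, beta=1.0):
    """Greedily pick pool indices that are uncertain, mutually dissimilar,
    and dissimilar to the labeled set.

    probs       -- predicted class probabilities for each pool instance
    pool_emb    -- embedding of each pool instance (assumed to come from the network)
    labeled_emb -- embeddings of already-labeled instances
    alpha, beta -- hypothetical trade-off weights (not taken from the paper)
    """
    uncertainty = [entropy(p) for p in probs]
    # Exploration term: how close each pool point is to its nearest labeled point.
    near_labeled = [
        max((cosine(e, l) for l in labeled_emb), default=0.0) for e in pool_emb
    ]
    selected = []
    remaining = set(range(len(pool_emb)))
    while remaining and len(selected) < batch_size:

        def score(i):
            # Exploitation term: penalize similarity to points already in the batch.
            redundancy = max(
                (cosine(pool_emb[i], pool_emb[j]) for j in selected), default=0.0
            )
            return uncertainty[i] - alpha * redundancy - beta * near_labeled[i]

        # sorted() makes tie-breaking deterministic (lowest index wins).
        best = max(sorted(remaining), key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

With `alpha = beta = 0` this reduces to plain uncertainty sampling. Raising `alpha` discourages near-duplicate picks within a batch, and raising `beta` pushes the batch toward regions far from the labeled set.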
