A Comparative Survey: Benchmarking for Pool-based Active Learning

Active learning (AL) is a subfield of machine learning (ML) in which a learning algorithm aims to achieve good accuracy with fewer training samples by interactively querying an oracle to label new data points. Pool-based AL is well motivated in many ML tasks where unlabeled data is abundant but labels are hard or costly to obtain. Although many pool-based AL methods have been developed, some important questions remain unanswered, such as how to: 1) determine the current state-of-the-art technique; 2) evaluate the relative benefit of new methods under various dataset properties; 3) understand what specific problems merit greater attention; and 4) measure the progress of the field over time. In this paper, we survey and compare various AL strategies used in both recently proposed and classic, highly cited methods. We propose to benchmark pool-based AL methods with a variety of datasets and quantitative metrics, and draw insights from the comparative empirical results.
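To make the pool-based setting concrete, the following is a minimal sketch of one query loop, not the benchmark protocol of this paper. It assumes a synthetic two-class pool, a nearest-centroid model, and least-confidence uncertainty sampling as the query strategy; the names `make_pool`, `centroid_proba`, and `active_learning_loop` are hypothetical and chosen only for illustration.

```python
import numpy as np


def make_pool(n=200, seed=0):
    """Synthetic pool (assumption): two Gaussian blobs with hidden labels."""
    rng = np.random.default_rng(seed)
    X = np.vstack([rng.normal(-1.0, 1.0, (n // 2, 2)),
                   rng.normal(+1.0, 1.0, (n // 2, 2))])
    y = np.repeat([0, 1], n // 2)
    return X, y


def centroid_proba(X_train, y_train, X):
    """Class probabilities from a nearest-centroid model.

    Probabilities are a softmax over negative distances to each class centroid.
    """
    centroids = np.stack([X_train[y_train == c].mean(axis=0) for c in (0, 1)])
    dist = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    logits = -dist
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)


def active_learning_loop(X, y_oracle, budget=20):
    """Pool-based AL with least-confidence sampling (illustrative sketch).

    y_oracle plays the role of the oracle: a label is read only when queried.
    """
    labeled = [0, len(X) - 1]  # one seed example per class in this synthetic pool
    pool = [i for i in range(len(X)) if i not in labeled]
    history = []
    for _ in range(budget):
        proba = centroid_proba(X[labeled], y_oracle[labeled], X[pool])
        # Query the pool point whose most likely class has the lowest probability.
        query = pool[int(np.argmin(proba.max(axis=1)))]
        labeled.append(query)  # the oracle reveals y_oracle[query]
        pool.remove(query)
        pred = centroid_proba(X[labeled], y_oracle[labeled], X).argmax(axis=1)
        history.append(float((pred == y_oracle).mean()))
    return labeled, history


if __name__ == "__main__":
    X, y = make_pool()
    labeled, history = active_learning_loop(X, y, budget=20)
    print(f"labels used: {len(labeled)}, final accuracy: {history[-1]:.3f}")
```

The accuracy recorded after each query traces a learning curve, which is the kind of quantity a benchmark can compare across query strategies. Swapping the `argmin` line for a different selection rule changes the strategy without touching the rest of the loop.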
