ALdataset: a benchmark for pool-based active learning

Active learning (AL) is a subfield of machine learning (ML) in which a learning algorithm can achieve good accuracy with fewer training samples by interactively querying a user or oracle to label new data points. Pool-based AL is well motivated in many ML tasks where unlabeled data is abundant but labels are hard to obtain. Although many pool-based AL methods have been developed, the lack of comparative benchmarks and integration of techniques makes it difficult to: 1) determine the current state-of-the-art method; 2) evaluate the relative benefit of new methods on datasets with different properties; 3) understand which specific problems merit greater attention; and 4) measure the progress of the field over time. To enable easier comparative evaluation among AL methods, we present a benchmark task for pool-based active learning, consisting of benchmarking datasets and quantitative metrics that summarize overall performance. We report experimental results for various active learning strategies, both recently proposed and classic highly cited methods, and draw insights from the results.
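To make the pool-based setting concrete, the sketch below shows one classic query strategy, uncertainty sampling, using scikit-learn. The synthetic dataset, model choice, and query budget are illustrative assumptions, not the configuration used in the benchmark itself.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# Start with a small labeled seed set; the rest form the unlabeled pool.
labeled = list(rng.choice(len(X), size=10, replace=False))
pool = [i for i in range(len(X)) if i not in labeled]

model = LogisticRegression(max_iter=1000)
for _ in range(20):  # query budget
    model.fit(X[labeled], y[labeled])
    # Uncertainty sampling: query the pool point whose most probable
    # class has the lowest predicted probability (least confident).
    probs = model.predict_proba(X[pool])
    query = pool[int(np.argmin(np.max(probs, axis=1)))]
    labeled.append(query)  # the oracle reveals y[query]
    pool.remove(query)

accuracy = model.score(X, y)
```

The key property of the pool-based setting is visible here: the learner sees the entire unlabeled pool at once and ranks it before each query, which is exactly what the benchmark's strategies (uncertainty-, representativeness-, and learning-based) differ on.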
