Bayesian Active Learning by Soft Mean Objective Cost of Uncertainty

To achieve label efficiency when training supervised learning models, pool-based active learning sequentially selects samples from a candidate set as queries to label by optimizing an acquisition function. One category of existing methods adopts one-step look-ahead strategies with acquisition functions tailored to the learning objective, for example the expected loss reduction (ELR) or the recently proposed mean objective cost of uncertainty (MOCU). These methods achieve the maximum expected classification error reduction when a single query is considered. However, it is well known that such myopic methods carry no long-run performance guarantee. In this paper, we show that these methods are not guaranteed to converge to the optimal classifier of the true model because MOCU is not strictly concave. We therefore propose Soft MOCU, a strictly concave approximation of MOCU, which can be used to define an acquisition function that guides Bayesian active learning with a theoretical convergence guarantee. Experiments on training Bayesian classifiers with both synthetic and real-world data demonstrate the superior performance of active learning by Soft MOCU compared to existing methods.
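To make the one-step look-ahead selection concrete, the sketch below scores each candidate query by its expected reduction of a Soft-MOCU-style objective. It is a minimal illustration, assuming a small discrete feature space, a finite model class with posterior weights, binary labels, and a log-sum-exp soft-max with temperature k as the smooth surrogate for the hard max; all names and the toy setup are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative setup (not the paper's experiments): a finite model class where
# each candidate model theta_i specifies p(y = 1 | x) on a small discrete
# feature space, and `post` is the current posterior over that model class.
n_models, n_points = 5, 8
p_y1 = rng.uniform(0.1, 0.9, size=(n_models, n_points))  # p(y = 1 | x, theta_i)
post = np.full(n_models, 1.0 / n_models)                  # uniform prior over models


def soft_max2(p, k=10.0):
    """Log-sum-exp soft-max of (p, 1 - p); exceeds max(p, 1 - p) by at most log(2)/k."""
    return np.log(np.exp(k * p) + np.exp(k * (1.0 - p))) / k


def soft_mocu(post, k=10.0):
    """MOCU-style uncertainty averaged over the feature space, with the hard max
    of the posterior-predictive (Bayesian) classifier replaced by the soft-max."""
    pred = post @ p_y1                                    # posterior-predictive p(y = 1 | x)
    per_model = post @ np.maximum(p_y1, 1.0 - p_y1)       # E_theta[best accuracy under theta]
    return np.mean(per_model - soft_max2(pred, k))


def expected_reduction(x_idx, post, k=10.0):
    """One-step look-ahead score: current Soft-MOCU-style value minus its
    expectation after observing the label of candidate x_idx."""
    current = soft_mocu(post, k)
    p1 = float(post @ p_y1[:, x_idx])                     # predictive p(y = 1 | x_idx)
    expected_after = 0.0
    for y, p_y in ((1, p1), (0, 1.0 - p1)):
        lik = p_y1[:, x_idx] if y == 1 else 1.0 - p_y1[:, x_idx]
        new_post = post * lik
        new_post /= new_post.sum()                        # Bayes update given (x_idx, y)
        expected_after += p_y * soft_mocu(new_post, k)
    return current - expected_after


scores = [expected_reduction(i, post) for i in range(n_points)]
print("query candidate index:", int(np.argmax(scores)))
```

In this sketch the log-sum-exp exceeds the hard max by at most log(2)/k, so the temperature k trades off fidelity to the original MOCU objective against the strict concavity that the paper relies on for its convergence guarantee.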
