Budgeted stream-based active learning via adaptive submodular maximization

Active learning reduces annotation cost by adaptively selecting which unlabeled instances to label. For pool-based active learning, several effective methods with theoretical guarantees have been developed by maximizing a utility function that satisfies adaptive submodularity. In contrast, few methods for stream-based active learning are based on adaptive submodularity. In this paper, we propose a new class of utility functions, policy-adaptive submodular functions, and prove that this class includes many existing adaptive submodular functions that appear in real-world problems. We provide a general framework based on policy-adaptive submodularity that makes it possible to convert existing pool-based methods into stream-based methods, and we give theoretical guarantees on their performance. In addition, we empirically demonstrate their effectiveness in comparison with existing heuristics on common benchmark datasets.
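
To make the budgeted stream-based setting concrete, the following is a minimal sketch and not the algorithm proposed in the paper. It assumes a finite hypothesis class with a uniform prior and noiseless labels, uses expected version-space reduction as the utility (a standard example of an adaptive submodular function), and queries an instance only if its expected gain reaches a fixed threshold. The names `make_threshold`, `expected_gain`, and `stream_select`, as well as the threshold rule itself, are hypothetical and chosen for illustration only.

```python
from typing import Callable, Iterable, List, Sequence, Tuple

Hypothesis = Callable[[int], int]


def make_threshold(t: int) -> Hypothesis:
    """Toy hypothesis (assumption): predicts 1 exactly when x >= t."""
    return lambda x: int(x >= t)


def expected_gain(x: int, version_space: Sequence[Hypothesis]) -> float:
    """Expected number of hypotheses eliminated by querying x.

    Assumes a uniform prior over the version space and noiseless labels.
    """
    n = len(version_space)
    if n == 0:
        return 0.0
    pos = sum(h(x) for h in version_space)
    neg = n - pos
    # Label 1 occurs with probability pos/n and eliminates neg hypotheses;
    # label 0 occurs with probability neg/n and eliminates pos hypotheses.
    return 2.0 * pos * neg / n


def stream_select(
    stream: Iterable[int],
    oracle: Callable[[int], int],
    hypotheses: Sequence[Hypothesis],
    budget: int,
    threshold: float,
) -> Tuple[List[Tuple[int, int]], List[Hypothesis]]:
    """Single pass over the stream; each query decision is irrevocable."""
    version_space = list(hypotheses)
    queried: List[Tuple[int, int]] = []
    for x in stream:
        if len(queried) >= budget:
            break  # labeling budget exhausted
        if expected_gain(x, version_space) >= threshold:
            y = oracle(x)
            queried.append((x, y))
            # Keep only hypotheses consistent with the observed label.
            version_space = [h for h in version_space if h(x) == y]
    return queried, version_space


# Usage: nine threshold classifiers; the true threshold is 5.
hypotheses = [make_threshold(t) for t in range(1, 10)]
queried, remaining = stream_select(
    stream=[0, 5, 2, 7, 4, 9, 6],
    oracle=make_threshold(5),
    hypotheses=hypotheses,
    budget=3,
    threshold=1.0,
)
```

In this toy run, instances 0 and 7 are skipped because every remaining hypothesis agrees on their label, so querying them would eliminate nothing. A pool-based greedy method would instead rescan all candidates before each query, which a stream does not allow.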
