Near-optimal Batch Mode Active Learning and Adaptive Submodular Optimization

Active learning can lead to a dramatic reduction in labeling effort. However, in many practical settings (such as crowdsourcing, surveys, and high-throughput experimental design), it is preferable to query labels for batches of examples to be labeled in parallel. While several heuristics have been proposed for batch-mode active learning, little is known about their theoretical performance. We consider batch-mode active learning and more general information-parallel stochastic optimization problems that exhibit adaptive submodularity, a natural diminishing-returns condition. We prove that for such problems, a simple greedy strategy is competitive with the optimal batch-mode policy. In some cases, surprisingly, the use of batches incurs competitively low cost, even when compared to a fully sequential strategy. We demonstrate the effectiveness of our approach on batch-mode active learning tasks, where it outperforms the state of the art, as well as on the novel problem of multi-stage influence maximization in social networks.
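
To make the greedy batch strategy concrete, the sketch below shows one plausible instantiation in Python. It is a minimal illustration under stated assumptions, not the paper's implementation: the pool `candidates`, the labeling `oracle`, and the scoring function `expected_gain` are hypothetical placeholders, where `expected_gain` stands in for an adaptive submodular objective (for instance, expected reduction in version-space mass), evaluated in expectation over the unknown label given everything observed so far.

```python
def greedy_batch(candidates, observed, expected_gain, batch_size):
    """Greedily assemble one batch by expected marginal gain.

    candidates    -- pool of unlabeled examples (hypothetical placeholder)
    observed      -- dict mapping already-queried examples to their labels
    expected_gain -- callable (example, batch, observed) -> float, the
                     expected marginal utility of adding `example` to the
                     current batch; assumed to be adaptive submodular,
                     i.e., gains only shrink as more evidence accumulates
    batch_size    -- number of examples to be labeled in parallel
    """
    batch = []
    remaining = [x for x in candidates if x not in observed]
    for _ in range(min(batch_size, len(remaining))):
        # Myopic step: pick the example with the largest expected marginal
        # gain given the labels seen so far and the batch built this round.
        best = max(remaining, key=lambda x: expected_gain(x, batch, observed))
        batch.append(best)
        remaining.remove(best)
    return batch


def batch_mode_active_learning(candidates, oracle, expected_gain,
                               batch_size, num_rounds):
    """Alternate greedy batch selection with parallel label queries."""
    observed = {}
    for _ in range(num_rounds):
        batch = greedy_batch(candidates, observed, expected_gain, batch_size)
        if not batch:
            break
        # All labels in the batch are requested in parallel (e.g., one
        # crowdsourcing round) and revealed only before the next round.
        for x in batch:
            observed[x] = oracle(x)
    return observed
```

Note that each round commits to a full batch before any of its labels are revealed, which is the information-parallel structure the abstract refers to; adaptive submodularity of the objective is the condition under which this myopic selection is competitive with the optimal batch-mode policy.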
