Many real-world datasets can be represented as graphs whose edge weights encode similarities between instances. A discrete Gaussian random field (GRF) model is a finite-dimensional Gaussian process (GP) whose prior covariance is the inverse of a graph Laplacian. Minimizing the trace of the predictive covariance Σ (V-optimality) on GRFs has proven successful in batch active learning for classification under budget constraints; however, a worst-case guarantee has been missing. We show that V-optimality on GRFs, as a function of the batch query set, is submodular, and hence greedy selection guarantees a (1 − 1/e) approximation ratio. Moreover, GRF models satisfy the absence-of-suppressor (AofS) condition. For active survey problems, we propose an analogous survey criterion that minimizes 1ᵀΣ1. In practice, the V-optimality criterion outperforms GPs with mutual-information-gain criteria and allows nonuniform costs for different nodes.
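The greedy V-optimality procedure described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the GRF predictive covariance over the unlabeled nodes is the inverse of the (regularized) Laplacian restricted to those nodes, and at each step it queries the node that most reduces the trace of that covariance. The function name and the `delta` regularizer are illustrative choices.

```python
import numpy as np

def greedy_v_optimal(L, budget, delta=1e-2):
    """Greedily select a batch of nodes to query under the V-optimality criterion.

    L      : (n x n) graph Laplacian.
    budget : number of nodes to select.
    delta  : ridge term so L + delta*I is invertible (assumption for this sketch).
    Returns the list of selected node indices.
    """
    n = L.shape[0]
    Lr = L + delta * np.eye(n)  # regularized Laplacian (prior precision)
    selected = []
    remaining = set(range(n))
    for _ in range(budget):
        best, best_trace = None, np.inf
        for v in remaining:
            # Unlabeled nodes if v were added to the query batch.
            U = sorted(remaining - {v})
            if not U:
                trace = 0.0
            else:
                # Predictive covariance of unlabeled nodes: (Lr restricted to U)^{-1}.
                trace = np.trace(np.linalg.inv(Lr[np.ix_(U, U)]))
            if trace < best_trace:  # smaller remaining variance is better
                best, best_trace = v, trace
        selected.append(best)
        remaining.remove(best)
    return selected
```

Because the trace reduction is submodular, this greedy loop inherits the (1 − 1/e) approximation guarantee; a practical implementation would use rank-one covariance updates rather than refactoring the inverse at every candidate.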