Selecting Observations against Adversarial Objectives

In many applications, one must actively select among a set of expensive observations before making an informed decision. Often, we want to select observations that perform well when evaluated with an objective function chosen by an adversary. Examples include minimizing the maximum posterior variance in Gaussian Process regression, robust experimental design, and sensor placement for outbreak detection. In this paper, we present the Submodular Saturation algorithm, a simple and efficient algorithm with strong theoretical approximation guarantees for the case where the possible objective functions exhibit submodularity, an intuitive diminishing returns property. Moreover, we prove that better approximation algorithms do not exist unless NP-complete problems admit efficient algorithms. We evaluate our algorithm on several real-world problems. For Gaussian Process regression, our algorithm compares favorably with state-of-the-art heuristics described in the geostatistics literature, while being simpler and faster and providing theoretical guarantees. For robust experimental design, our algorithm compares favorably with SDP-based algorithms.
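
The two notions the abstract relies on, diminishing returns and an adversarially chosen objective, can be written down compactly. The following is a sketch under assumed notation, none of which appears in the abstract itself: a ground set V of candidate observations, set functions F and F_1, ..., F_m, and a budget k on the number of selected observations.

```latex
% Assumed notation (not from the abstract): V is the ground set of candidate
% observations, F, F_1, ..., F_m are set functions on V, k is a budget.

% Submodularity (diminishing returns): adding s to a smaller set helps
% at least as much as adding it to a larger set.
\[
  F(A \cup \{s\}) - F(A) \;\ge\; F(B \cup \{s\}) - F(B)
  \qquad \text{for all } A \subseteq B \subseteq V,\ s \in V \setminus B .
\]

% Adversarial selection: choose observations that do well under the
% worst-case objective among F_1, ..., F_m.
\[
  \mathcal{A}^{*} \in \operatorname*{arg\,max}_{\mathcal{A} \subseteq V,\ |\mathcal{A}| \le k}
  \ \min_{i \in \{1,\dots,m\}} F_i(\mathcal{A}) .
\]
```

Under this reading, the first example in the abstract would take each F_i to measure the variance reduction at one prediction location, so that maximizing the minimum corresponds to minimizing the maximum posterior variance.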
