Near-optimal Nonmyopic Value of Information in Graphical Models

A fundamental issue in real-world systems, such as sensor networks, is the selection of observations that most effectively reduce uncertainty. More specifically, we address the long-standing problem of nonmyopically selecting the most informative subset of variables in a graphical model. We present the first efficient randomized algorithm providing a constant-factor (1 - 1/e - ε) approximation guarantee for any ε > 0 with high confidence. The algorithm leverages the theory of submodular functions, in combination with a polynomial bound on sample complexity. We furthermore prove that no polynomial-time algorithm can provide a constant-factor approximation better than (1 - 1/e) unless P = NP. Finally, we provide extensive evidence of the effectiveness of our method on two complex real-world datasets.
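To make the role of submodularity concrete, the following is a minimal sketch of greedy subset selection for a monotone submodular set function, the classical setting in which a (1 - 1/e) guarantee is known. It is an illustration, not the algorithm from the paper: the toy coverage objective stands in for an information-based objective, and the names `greedy_select`, `coverage`, and `COVERAGE` are hypothetical.

```python
from typing import Callable, FrozenSet, Iterable, List

# Hypothetical toy data: each candidate "sensor" covers a set of locations.
# Set coverage is a standard example of a monotone submodular function.
COVERAGE = {
    "s1": {1, 2, 3},
    "s2": {3, 4},
    "s3": {4, 5, 6, 7},
    "s4": {1, 7},
}


def coverage(chosen: FrozenSet[str]) -> int:
    """Number of distinct locations covered by the chosen sensors."""
    covered = set()
    for sensor in chosen:
        covered |= COVERAGE[sensor]
    return len(covered)


def greedy_select(
    candidates: Iterable[str],
    objective: Callable[[FrozenSet[str]], float],
    k: int,
) -> List[str]:
    """Pick up to k items, each time adding the one with the largest marginal gain."""
    selected: List[str] = []
    remaining = list(candidates)
    for _ in range(min(k, len(remaining))):
        current = objective(frozenset(selected))
        # Marginal gain of adding v to the current selection.
        best = max(
            remaining,
            key=lambda v: objective(frozenset(selected + [v])) - current,
        )
        selected.append(best)
        remaining.remove(best)
    return selected


picked = greedy_select(sorted(COVERAGE), coverage, k=2)
print(picked, coverage(frozenset(picked)))
```

In this sketch the objective is evaluated exactly. The abstract's additional ε term and "high confidence" qualifier suggest that the paper's objective is instead estimated by sampling, which this example does not attempt to reproduce.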
