Cost-effective outbreak detection in networks

Given a water distribution network, where should we place sensors to quickly detect contaminants? Or, which blogs should we read to avoid missing important stories? These seemingly different problems share a common structure: outbreak detection can be modeled as selecting nodes (sensor locations, blogs) in a network in order to detect the spreading of a virus or information as quickly as possible. We present a general methodology for near-optimal sensor placement in these and related problems. We demonstrate that many realistic outbreak detection objectives (e.g., detection likelihood, population affected) exhibit the property of "submodularity". We exploit submodularity to develop an efficient algorithm that scales to large problems, achieving near-optimal placements while being 700 times faster than a simple greedy algorithm. We also derive online bounds on the quality of the placements obtained by any algorithm. Our algorithms and bounds also handle cases where nodes (sensor locations, blogs) have different costs. We evaluate our approach on several large real-world problems, including a model of a water distribution network from the EPA and real blog data. The obtained sensor placements are provably near-optimal, providing a constant fraction of the optimal solution. We show that the approach scales, achieving speedups and savings in storage of several orders of magnitude. We also show how the approach leads to deeper insights in both applications, answering multicriteria trade-off, cost-sensitivity, and generalization questions.
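To illustrate how submodularity can make greedy selection cheaper, here is a minimal sketch of lazy greedy evaluation. It is an illustration under stated assumptions, not the paper's implementation: every node has unit cost, the objective is a toy coverage function (the number of outbreak scenarios detected), and the names `detects`, `coverage`, and `lazy_greedy` are hypothetical. Because marginal gains of a submodular function can only shrink as the selected set grows, a stale gain is an upper bound, so a node whose freshly computed gain still tops the queue can be chosen without re-evaluating the others.

```python
import heapq


def coverage(selected, detects):
    """Toy objective: number of scenarios detected by at least one selected node."""
    covered = set()
    for node in selected:
        covered |= detects[node]
    return len(covered)


def lazy_greedy(detects, budget):
    """Select up to `budget` nodes (unit costs assumed), recomputing gains lazily."""
    selected = []
    current = 0       # objective value of the current selection
    evaluations = 0   # number of marginal-gain computations performed

    # Max-heap via negated gains. Each entry stores the size of the selected
    # set at the time its gain was computed, so stale entries can be detected.
    heap = []
    for node in sorted(detects):
        gain = coverage([node], detects)
        evaluations += 1
        heap.append((-gain, node, 0))
    heapq.heapify(heap)

    while heap and len(selected) < budget:
        neg_gain, node, computed_at = heapq.heappop(heap)
        if computed_at == len(selected):
            # Gain is up to date; by submodularity no stale entry can beat it.
            if -neg_gain <= 0:
                break
            selected.append(node)
            current += -neg_gain
        else:
            # Stale upper bound: recompute the gain and push the node back.
            gain = coverage(selected + [node], detects) - current
            evaluations += 1
            heapq.heappush(heap, (-gain, node, len(selected)))

    return selected, current, evaluations


# Hypothetical data: node -> set of outbreak scenarios it would detect.
detects = {
    "a": {1, 2, 3, 4},
    "b": {3, 4, 5},
    "c": {5, 6},
    "d": {1, 2},
}
selected, value, evaluations = lazy_greedy(detects, budget=2)
print(selected, value, evaluations)
```

On this toy instance the lazy variant never re-evaluates node "d" in the second round, since its stale gain already ties with the refreshed gain of "c". A plain greedy pass would recompute every remaining node in every round. Handling non-uniform node costs would require a different selection rule (for example, ranking by gain per unit cost), which this sketch does not attempt.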
