Detecting sources of computer viruses in networks: theory and experiment

We provide a systematic study of the problem of finding the source of a computer virus in a network. We model virus spreading in a network with a variant of the popular SIR model and then construct an estimator for the virus source. This estimator is based on a novel combinatorial quantity that we term rumor centrality. We establish that it is a maximum likelihood (ML) estimator for a class of graphs. We find the following surprising threshold phenomenon: on trees that grow faster than a line, the estimator always has non-trivial detection probability, whereas on trees that grow like a line, the detection probability goes to 0 as the network grows. Simulations performed on synthetic networks such as the popular small-world and scale-free networks, and on real networks such as an internet AS network and the U.S. electric power grid network, show that the estimator either finds the source exactly or within a few hops across these network topologies. We compare rumor centrality to another common network centrality notion known as distance centrality. We prove that on trees, the rumor center and distance center are equivalent, but on general networks, they may differ. Indeed, simulations show that rumor centrality outperforms distance centrality in finding virus sources in networks that are not tree-like.
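As a rough illustration of the two centrality notions compared above, the sketch below computes both on a small tree. The abstract does not state the formula, so the code assumes the definition commonly used in the rumor-source literature: for a tree with n nodes, the rumor centrality of v is n! divided by the product, over all nodes u, of the size of the subtree under u when the tree is rooted at v. Distance centrality is taken to be the sum of hop distances from v to every other node, so the distance center minimizes it. The function names and the example tree are hypothetical and chosen only for illustration.

```python
from collections import deque
from math import factorial


def subtree_sizes(adj, root):
    """Size of the subtree under each node when the tree is rooted at `root`."""
    parent = {root: None}
    order = []            # nodes in DFS preorder: parents before descendants
    stack = [root]
    while stack:
        u = stack.pop()
        order.append(u)
        for w in adj[u]:
            if w not in parent:
                parent[w] = u
                stack.append(w)
    size = {u: 1 for u in adj}
    # Walk preorder backwards so each node's size is final before it is
    # added to its parent.
    for u in reversed(order):
        if parent[u] is not None:
            size[parent[u]] += size[u]
    return size


def rumor_centrality(adj, v):
    """Assumed definition: n! / product of subtree sizes, tree rooted at v."""
    prod = 1
    for s in subtree_sizes(adj, v).values():
        prod *= s
    return factorial(len(adj)) // prod


def distance_centrality(adj, v):
    """Sum of hop distances from v to all other nodes (BFS)."""
    dist = {v: 0}
    queue = deque([v])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return sum(dist.values())


# Hypothetical 5-node tree with edges 0-1, 1-2, 1-3, 3-4.
adj = {0: [1], 1: [0, 2, 3], 2: [1], 3: [1, 4], 4: [3]}

rumor_center = max(adj, key=lambda v: rumor_centrality(adj, v))
distance_center = min(adj, key=lambda v: distance_centrality(adj, v))
print(rumor_center, distance_center)
```

On this tree both centers are node 1, in line with the equivalence on trees claimed in the abstract. The sketch applies only to trees; it does not attempt the general-network case, where the abstract notes that the two centers may differ.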
