Identifying Diffusion Sources in Large Networks: A Community Structure Based Approach

The global diffusion of epidemics, rumors, and computer viruses causes great damage to society. It is critical to identify diffusion sources and promptly quarantine them. However, most methods proposed so far are unsuitable for large networks because of their computational cost and the complexity of spatiotemporal diffusion processes. In this paper, we develop a community structure based approach to efficiently identify diffusion sources in large networks. We first detect the community structure of a network and place sensors on community bridge nodes to record diffusion dynamics. From the infection times of the bridge sensors, we determine the first infected community, from which the diffusion started and spread to other communities. This overcomes the scalability issue in source identification by narrowing the set of suspects down to the first infected community. Then, to accurately locate the diffusion source among the suspects, we exploit an intrinsic property of diffusion sources: the relative infection time of any node is linearly related to its effective distance from the source. Thus, for each suspect, we compute the correlation coefficient that measures the degree of linear dependence between the sensors' relative infection times and their effective distances from the suspect, and we take the suspect with the greatest correlation coefficient as the source. We evaluate our approach on two large networks, collected from Twitter, containing more than 300,000 nodes. The experimental results show that our method identifies diffusion sources with a very high degree of accuracy. In particular, the accuracy of our approach increases dramatically as the average community size shrinks.
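The two stages described above can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions: the first infected community is taken to be the community of the earliest-infected sensor; effective distance is the shortest-path length under edge lengths 1 - ln(P), where P is the fraction of a node's outgoing weight carried by the edge; and the correlation coefficient is Pearson's. All function names, the toy graph, and the synthetic infection times are hypothetical.

```python
import heapq
import math


def effective_distances(graph, source):
    """Shortest-path distances from `source` with edge length 1 - ln(P_uv),
    where P_uv = w_uv / sum_k w_uk (an assumed definition of effective distance)."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        total = sum(graph[u].values())
        for v, w in graph[u].items():
            nd = d + 1.0 - math.log(w / total)
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def locate_source(graph, community_of, sensor_times):
    """Stage 1: pick the community of the earliest-infected sensor (assumption).
    Stage 2: return the suspect in that community whose effective distances to
    the sensors correlate best with the sensors' relative infection times."""
    earliest = min(sensor_times, key=sensor_times.get)
    first_community = community_of[earliest]
    suspects = [n for n, c in community_of.items() if c == first_community]

    sensors = list(sensor_times)
    t0 = sensor_times[earliest]
    relative_times = [sensor_times[s] - t0 for s in sensors]

    best, best_r = None, -math.inf
    for suspect in suspects:
        dist = effective_distances(graph, suspect)
        if any(s not in dist for s in sensors):
            continue  # suspect cannot reach every sensor
        r = pearson(relative_times, [dist[s] for s in sensors])
        if r > best_r:
            best, best_r = suspect, r
    return best, best_r


# Toy undirected graph with unit weights (hypothetical data).
# Community 0 is the path a - b - c; x and y belong to other communities.
graph = {
    "a": {"b": 1.0, "x": 1.0},
    "b": {"a": 1.0, "c": 1.0},
    "c": {"b": 1.0, "y": 1.0},
    "x": {"a": 1.0},
    "y": {"c": 1.0},
}
community_of = {"a": 0, "b": 0, "c": 0, "x": 1, "y": 2}

# Synthetic sensor readings on the bridge nodes: infection times that are
# exactly linear in effective distance from the true source "b".
true_dist = effective_distances(graph, "b")
sensor_times = {s: 5.0 + 2.0 * true_dist[s] for s in ("a", "x", "c", "y")}

source, r = locate_source(graph, community_of, sensor_times)
print(source, round(r, 3))
```

Because the Pearson coefficient is invariant to shifting the time axis, the unknown start time of the diffusion does not need to be estimated in this sketch; only the relative ordering and spacing of the sensor infection times matter.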
