Information Source Detection via Maximum A Posteriori Estimation

Information source detection aims to identify the origin of a diffusion process (e.g., a computer virus, a rumor, or an epidemic) and has attracted increasing attention from the research community in recent years. Although various methods have been proposed, such as those based on centrality, spectral analysis, and belief propagation, existing solutions still suffer from high time complexity and inadequate effectiveness. To this end, we revisit the problem in this paper and present a comprehensive study from the perspective of likelihood approximation. Unlike many previous works, we consider both infected and uninfected nodes when estimating the likelihood used for detection. Specifically, we propose a Maximum A Posteriori (MAP) estimator that detects the information source in general graphs, with rumor centrality as the prior. To further improve efficiency, we design two approximate estimators, namely Brute Force Search Approximation (BFSA) and Greedy Search Bound Approximation (GSBA). BFSA traverses the permitted permutations and computes the likelihood directly, while GSBA uses a greedy search to find a surrogate upper bound on the probabilities of the permitted permutations for a given node, from which it derives an approximate MAP estimator. Extensive experiments on several network data sets demonstrate the effectiveness of our methods in detecting a single information source.
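
To make the prior concrete, the following is a minimal sketch of rumor centrality on a tree, using the standard definition due to Shah and Zaman: for a tree with n nodes rooted at a candidate source v, the number of permitted permutations is n! divided by the product of the subtree sizes. The function names `rumor_centrality` and `rumor_center`, and the adjacency-dictionary input, are assumptions made for illustration. The likelihood term of the MAP estimator, as well as BFSA and GSBA, are not specified in this abstract and are not reproduced here.

```python
from math import factorial


def rumor_centrality(adj, root):
    """Count permitted permutations of a tree rooted at `root`.

    `adj` maps each node to a list of its neighbors and must describe a tree.
    Uses the closed form n! / prod(subtree sizes) for trees.
    """
    parent = {root: None}
    order = []
    stack = [root]
    # Iterative DFS: record a visiting order and each node's parent.
    while stack:
        u = stack.pop()
        order.append(u)
        for w in adj[u]:
            if w != parent[u]:
                parent[w] = u
                stack.append(w)

    # Accumulate subtree sizes from the leaves back up to the root.
    size = {u: 1 for u in adj}
    for u in reversed(order):
        if parent[u] is not None:
            size[parent[u]] += size[u]

    prod = 1
    for u in adj:
        prod *= size[u]
    return factorial(len(adj)) // prod


def rumor_center(adj):
    """Return the node with maximal rumor centrality (prior only, no likelihood)."""
    return max(adj, key=lambda v: rumor_centrality(adj, v))


# Hypothetical example: a star with center 0 and leaves 1, 2, 3.
star = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
scores = {v: rumor_centrality(star, v) for v in star}
```

In this star example the center admits more permitted permutations than any leaf, so a prior proportional to these counts favors the center. A MAP estimator in the sense described above would combine such a prior with a likelihood that also accounts for uninfected nodes.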
