SIMPATH: An Efficient Algorithm for Influence Maximization under the Linear Threshold Model

There is significant current interest in the problem of influence maximization: given a directed social network with influence weights on edges and a number k, find k seed nodes such that activating them leads to the maximum expected number of activated nodes, according to a propagation model. Kempe et al. showed, among other things, that under the Linear Threshold Model, the problem is NP-hard, and that a simple greedy algorithm guarantees the best possible approximation factor in PTIME. However, this algorithm suffers from various major performance drawbacks. In this paper, we propose Simpath, an efficient and effective algorithm for influence maximization under the linear threshold model that addresses these drawbacks by incorporating several clever optimizations. Through a comprehensive performance study on four real data sets, we show that Simpath consistently outperforms the state of the art w.r.t. running time, memory consumption and the quality of the seed set chosen, measured in terms of expected influence spread achieved.

[1]  Masahiro Kimura,et al.  Tractable Models for Information Diffusion in Social Networks , 2006, PKDD.

[2]  Laks V. S. Lakshmanan,et al.  Learning influence probabilities in social networks , 2010, WSDM '10.

[3]  Ching-Yung Lin,et al.  Personalized recommendation driven by information flow , 2006, SIGIR.

[4]  Martin Ester,et al.  A matrix factorization technique with trust propagation for recommendation in social networks , 2010, RecSys '10.

[5]  Laks V. S. Lakshmanan,et al.  A Data-Based Approach to Social Influence Maximization , 2011, Proc. VLDB Endow..

[6]  Duncan J. Watts,et al.  Everyone's an influencer: quantifying influence on twitter , 2011, WSDM '11.

[7]  Wei Chen,et al.  Efficient influence maximization in social networks , 2009, KDD.

[8]  BonchiFrancesco,et al.  A data-based approach to social influence maximization , 2011, VLDB 2011.

[9]  Qi He,et al.  TwitterRank: finding topic-sensitive influential twitterers , 2010, WSDM '10.

[10]  Andreas Krause,et al.  Cost-effective outbreak detection in networks , 2007, KDD '07.

[11]  D. Kroft,et al.  All paths through a maze , 1967 .

[12]  Sergey Brin,et al.  The Anatomy of a Large-Scale Hypertextual Web Search Engine , 1998, Comput. Networks.

[13]  Masahiro Kimura,et al.  Prediction of Information Diffusion Probabilities for Independent Cascade Model , 2008, KES.

[14]  Leslie G. Valiant,et al.  The Complexity of Enumeration and Reliability Problems , 1979, SIAM J. Comput..

[15]  Wei Chen,et al.  Scalable influence maximization for prevalent viral marketing in large-scale social networks , 2010, KDD.

[16]  George Karakostas,et al.  A better approximation ratio for the vertex cover problem , 2005, TALG.

[17]  Laks V. S. Lakshmanan,et al.  CELF++: optimizing the greedy algorithm for influence maximization in social networks , 2011, WWW.

[18]  Rossano Schifanella,et al.  Folks in Folksonomies: social link prediction from shared metadata , 2010, WSDM '10.

[19]  Yifei Yuan,et al.  Scalable Influence Maximization in Social Networks under the Linear Threshold Model , 2010, 2010 IEEE International Conference on Data Mining.

[20]  Donald B. Johnson,et al.  Finding All the Elementary Circuits of a Directed Graph , 1975, SIAM J. Comput..

[21]  Matthew Richardson,et al.  Mining the network value of customers , 2001, KDD '01.

[22]  David S. Johnson,et al.  Computers and Intractability: A Guide to the Theory of NP-Completeness , 1978 .