Influence Maximization in Near-Linear Time: A Martingale Approach

Given a social network $G$ and a positive integer $k$, the influence maximization problem asks for $k$ nodes (in $G$) whose adoptions of a certain idea or product can trigger the largest expected number of follow-up adoptions by the remaining nodes. This problem has been extensively studied in the literature, and the state-of-the-art technique runs in $O((k+\ell)(n+m)\log n / \varepsilon^2)$ expected time and returns a $(1 - 1/e - \varepsilon)$-approximate solution with probability at least $1 - 1/n^{\ell}$. This paper presents an influence maximization algorithm that provides the same worst-case guarantees as the state of the art, but offers significantly improved empirical efficiency. The core of our algorithm is a set of estimation techniques based on martingales, a classic statistical tool. These techniques not only provide accurate results with small computational overhead, but also enable our algorithm to support a larger class of information diffusion models than existing methods do. We experimentally evaluate our algorithm against state-of-the-art methods under several popular diffusion models, using real social networks with up to 1.4 billion edges. Our experimental results show that the proposed algorithm consistently outperforms these methods in terms of computational efficiency, and is often orders of magnitude faster.
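To make the problem statement concrete, the following is a minimal sketch of the objective being maximized. It is not the algorithm proposed in this paper. It assumes the independent cascade model as the diffusion model, estimates expected spread by plain Monte Carlo simulation, and picks seeds by exhaustive search, which is only feasible on toy graphs. All names (`simulate_ic`, `estimate_spread`, `best_seed_set_bruteforce`) and the adjacency-list format are hypothetical, and the spread here counts the seed nodes themselves.

```python
import random
from itertools import combinations


def simulate_ic(graph, seeds, rng):
    """Run one independent-cascade simulation and return the activated nodes.

    graph maps a node to a list of (neighbor, probability) pairs
    (an assumed input format, not one taken from the paper).
    """
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for u in frontier:
            for v, p in graph.get(u, []):
                # A newly activated node gets one chance to activate
                # each still-inactive out-neighbor, with probability p.
                if v not in active and rng.random() < p:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return active


def estimate_spread(graph, seeds, runs=1000, seed=0):
    """Monte Carlo estimate of the expected number of activated nodes."""
    rng = random.Random(seed)
    total = sum(len(simulate_ic(graph, seeds, rng)) for _ in range(runs))
    return total / runs


def best_seed_set_bruteforce(graph, nodes, k, runs=1000):
    """Try every size-k seed set; exponential in k, for illustration only."""
    return max(combinations(nodes, k),
               key=lambda s: estimate_spread(graph, s, runs))


if __name__ == "__main__":
    toy = {"a": [("b", 0.5), ("c", 0.5)], "b": [("d", 0.5)]}
    print(best_seed_set_bruteforce(toy, ["a", "b", "c", "d"], k=1))
```

The cost of repeating such simulations for every candidate seed set is what makes naive approaches impractical on large networks, which is the efficiency gap the abstract refers to.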
