Bring Order into the Samples: A Novel Scalable Method for Influence Maximization (Extended Abstract)

Given a positive integer k, a social network G, and a propagation model M, influence maximization aims to find a set of k nodes that has the largest influence spread. The state-of-the-art method, IMM, is based on the reverse influence sampling (RIS) framework. By using a martingale technique, it is substantially more efficient than previous methods. However, IMM still has limited scalability because deciding a tight sample size carries a high overhead. In this paper, instead of spending effort on deciding a tight sample size, we present a novel bottom-k sketch-based RIS framework, called BKRIS, which brings the order of samples into the RIS framework. By applying the sketch technique, we derive early termination conditions that significantly accelerate the seed set selection procedure. Moreover, we provide several optimization techniques to reduce the cost of generating and processing samples. Finally, we conduct experiments on 10 real social networks to demonstrate the efficiency and effectiveness of the proposed method. Further details are reported in [1].
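For readers unfamiliar with the RIS framework that both IMM and BKRIS build on, the following is a minimal sketch of generic RIS under the independent cascade model, followed by greedy maximum-coverage seed selection. It is not BKRIS and does not include IMM's sample-size estimation; the function names (`reverse_reachable_set`, `greedy_seeds`, `ris_influence_max`), the edge-list input format, and the fixed `num_samples` parameter are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def reverse_reachable_set(in_neighbors, nodes, rng):
    """Sample one reverse reachable (RR) set under independent cascade.

    Start from a uniformly random root and walk edges backwards,
    keeping each incoming edge (u -> v) with its probability p.
    """
    root = rng.choice(nodes)
    visited = {root}
    stack = [root]
    while stack:
        v = stack.pop()
        for u, p in in_neighbors.get(v, []):
            if u not in visited and rng.random() < p:
                visited.add(u)
                stack.append(u)
    return visited


def greedy_seeds(rr_sets, k):
    """Greedy maximum coverage: pick up to k nodes covering the most RR sets."""
    covers = defaultdict(set)  # node -> indices of RR sets containing it
    for i, rr in enumerate(rr_sets):
        for v in rr:
            covers[v].add(i)
    seeds, covered = [], set()
    for _ in range(k):
        best = max(covers, key=lambda v: len(covers[v] - covered), default=None)
        if best is None or not (covers[best] - covered):
            break  # no node adds new coverage
        seeds.append(best)
        covered |= covers[best]
    return seeds, len(covered)


def ris_influence_max(edges, n, k, num_samples, seed=0):
    """Hypothetical driver: edges are (u, v, p) triples over nodes 0..n-1."""
    rng = random.Random(seed)
    in_neighbors = defaultdict(list)
    for u, v, p in edges:
        in_neighbors[v].append((u, p))
    nodes = list(range(n))
    rr_sets = [reverse_reachable_set(in_neighbors, nodes, rng)
               for _ in range(num_samples)]
    seeds, covered = greedy_seeds(rr_sets, k)
    # The fraction of covered RR sets, scaled by n, estimates the spread.
    return seeds, n * covered / num_samples
```

In this sketch `num_samples` is simply supplied by the caller. Choosing it tightly is the step the abstract identifies as IMM's bottleneck, and it is the step BKRIS aims to avoid through ordered samples and early termination.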

[1] C. A. R. Hoare, et al. Algorithm 65: find, 1961, Commun. ACM.

[2] M. L. Fisher, et al. An analysis of approximations for maximizing submodular set functions—I, 1978, Math. Program.

[3] Xiaokui Xiao, et al. Influence Maximization in Near-Linear Time: A Martingale Approach, 2015, SIGMOD Conference.

[4] Jian Pei, et al. Continuous Influence Maximization: What Discounts Should We Offer to Social Network Users?, 2016, SIGMOD Conference.

[5] Wei Chen, et al. Scalable influence maximization for prevalent viral marketing in large-scale social networks, 2010, KDD.

[6] Jinhui Tang, et al. Online Topic-Aware Influence Maximization, 2015, Proc. VLDB Endow.

[7] Jon Kleinberg, et al. Maximizing the spread of influence through a social network, 2003, KDD '03.

[8] Wei Chen, et al. Efficient influence maximization in social networks, 2009, KDD.

[9] Andreas Krause, et al. Cost-effective outbreak detection in networks, 2007, KDD '07.

[10] Laks V. S. Lakshmanan, et al. CELF++: optimizing the greedy algorithm for influence maximization in social networks, 2011, WWW.

[11] Michael D. Vose, et al. A Linear Algorithm For Generating Random Numbers With a Given Distribution, 1991, IEEE Trans. Software Eng.

[12] C. A. R. Hoare, et al. Algorithm 64: Quicksort, 1961, Commun. ACM.

[13] Laks V. S. Lakshmanan, et al. Viral Marketing Meets Social Advertising: Ad Allocation with Minimum Regret, 2014, Proc. VLDB Endow.

[14] Xiaokui Xiao, et al. Influence maximization: near-optimal time complexity meets practical efficiency, 2014, SIGMOD Conference.

[15] Joel Oren, et al. Influence at Scale: Distributed Computation of Complex Contagion in Networks, 2015, KDD.

[16] Matthew Richardson, et al. Mining the network value of customers, 2001, KDD '01.

[17] Edith Cohen, et al. Sketch-based Influence Maximization and Computation: Scaling up with Guarantees, 2014, CIKM.

[18] Kyomin Jung, et al. IRIE: Scalable and Robust Influence Maximization in Social Networks, 2011, 2012 IEEE 12th International Conference on Data Mining.

[19] Bernhard Schölkopf, et al. Uncovering the Temporal Dynamics of Diffusion Networks, 2011, ICML.

[20] Christian Borgs, et al. Maximizing Social Influence in Nearly Optimal Time, 2012, SODA.

[21] Kian-Lee Tan, et al. Real-time Targeted Influence Maximization for Online Advertisements, 2015, Proc. VLDB Endow.

[22] Yifei Yuan, et al. Scalable Influence Maximization in Social Networks under the Linear Threshold Model, 2010, 2010 IEEE International Conference on Data Mining.

[23] Laks V. S. Lakshmanan, et al. SIMPATH: An Efficient Algorithm for Influence Maximization under the Linear Threshold Model, 2011, 2011 IEEE 11th International Conference on Data Mining.

[24] My T. Thai, et al. Stop-and-Stare: Optimal Sampling Algorithms for Viral Marketing in Billion-scale Networks, 2016, SIGMOD Conference.

[25] Matthew Richardson, et al. Mining knowledge-sharing sites for viral marketing, 2002, KDD.

[26] R. A. Fisher, et al. Statistical Tables for Biological, Agricultural and Medical Research, 1956.

[27] Edith Cohen, et al. Summarizing data using bottom-k sketches, 2007, PODC '07.

[28] Peter J. Haas, et al. On synopses for distinct-value estimation under multiset operations, 2007, SIGMOD '07.