A fractional memory-efficient approach for online continuous-time influence maximization

Influence maximization (IM) under a continuous-time diffusion model requires finding a set of initial adopters that, when activated, leads to the maximum expected number of users becoming activated within a given amount of time. State-of-the-art approximation algorithms for this intractable problem use reverse reachability influence samples to approximate the diffusion process. Unfortunately, these algorithms require storing large collections of such samples, and the memory cost can become prohibitive depending on the desired solution quality, the properties of the diffusion process, and the seed set size. To remedy this, we design an algorithm that processes the influence samples in a streaming manner, avoiding the need to store them. We approach IM using two fractional objectives: a fractional relaxation and a multi-linear extension of the original objective function. We derive a progressively improved upper bound on the optimal solution, which we empirically find to be tighter than the best existing upper bound. This enables instance-dependent solution quality guarantees that are observed to be vastly superior to the theoretical worst case. Leveraging these, we develop an algorithm that delivers solutions with a superior empirical solution quality guarantee at comparable running time and with greatly reduced memory usage compared to the state of the art. We demonstrate the superiority of our approach via extensive experiments on five real datasets of varying sizes, with up to 41M nodes and 1.5B edges.
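As a rough illustration of the two fractional objectives named above, the sketch below evaluates both on a stream of reverse reachability samples without storing them. It assumes the usual coverage view of such samples (a sample counts as covered if it contains a seed), under which a concave relaxation takes the form min(1, sum of x_v) per sample and the multi-linear extension takes the form 1 minus the product of (1 - x_v). The function names, the dictionary representation of the fractional vector `x`, and the toy samples are hypothetical; this is not the authors' algorithm, only a minimal sketch of how such objectives could be accumulated in one pass.

```python
from typing import Dict, Iterable, Set


def fractional_relaxation(samples: Iterable[Set[int]], x: Dict[int, float]) -> float:
    """Average of min(1, sum_{v in R} x_v) over a stream of samples R."""
    total, count = 0.0, 0
    for rr in samples:  # each sample is consumed once and then discarded
        total += min(1.0, sum(x.get(v, 0.0) for v in rr))
        count += 1
    return total / count if count else 0.0


def multilinear_extension(samples: Iterable[Set[int]], x: Dict[int, float]) -> float:
    """Average of 1 - prod_{v in R} (1 - x_v) over a stream of samples R."""
    total, count = 0.0, 0
    for rr in samples:
        miss = 1.0  # probability that no node of R is picked independently
        for v in rr:
            miss *= 1.0 - x.get(v, 0.0)
        total += 1.0 - miss
        count += 1
    return total / count if count else 0.0


def toy_stream():
    """Hypothetical generator standing in for an influence sample generator."""
    yield {1, 2}
    yield {2, 3}
    yield {4}


x_frac = {1: 0.5, 2: 0.5}   # fractional seed vector (assumed representation)
x_int = {2: 1.0}            # integral seed set {2}

relaxed = fractional_relaxation(toy_stream(), x_frac)
multilinear = multilinear_extension(toy_stream(), x_frac)
```

On integral vectors the two quantities coincide with the fraction of samples covered by the seed set; on fractional vectors the relaxation is never smaller than the multi-linear extension, which is what makes a relaxation of this kind a natural source of upper bounds.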
