Fast and Scalable Implementations of Influence Maximization Algorithms

The Influence Maximization problem has been extensively studied in the past decade because of its practical applications in finding the key influencers in social networks. Due to the hardness of the underlying problem, existing algorithms trade off practical efficiency against approximation guarantees. Approximate solutions take several hours of compute time on modest-sized real-world inputs, and there is a lack of effective parallel and distributed algorithms for this problem. In this paper, we present efficient parallel algorithms for multithreaded and distributed systems that solve the influence maximization problem with an approximation guarantee. Our algorithms extend the state-of-the-art sequential approach based on computing reverse reachability sets. We present a detailed experimental evaluation and analyze the algorithms' performance and their sensitivity to input parameters using real-world inputs. Our experimental results demonstrate significant speedup on parallel architectures. We further show a speedup of up to 586× relative to the state-of-the-art sequential baseline using 1024 nodes of a supercomputer, at far greater accuracy and twice the seed set size. To the best of our knowledge, this is the first effort to parallelize the influence maximization operation at scale.
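As a rough illustration of the reverse reachability idea named above, the following Python sketch samples reverse reachable (RR) sets and then picks seeds by greedy maximum coverage. It is a minimal sequential sketch, not the paper's implementation: the independent cascade diffusion model, the `in_neighbors` adjacency format, and the function names `sample_rr_set` and `select_seeds` are all assumptions made for this example. Each RR set is sampled independently of the others, which is the property that makes this family of methods a natural candidate for parallelization.

```python
import random
from collections import defaultdict


def sample_rr_set(in_neighbors, nodes, rng):
    """Sample one reverse reachable set (assumes the independent cascade model).

    in_neighbors maps a node v to a list of (u, p) pairs, meaning the edge
    u -> v is live with probability p. This format is hypothetical.
    """
    root = rng.choice(nodes)
    visited = {root}
    frontier = [root]
    while frontier:
        v = frontier.pop()
        # Walk edges backwards, keeping each edge with its own probability.
        for u, p in in_neighbors.get(v, ()):
            if u not in visited and rng.random() < p:
                visited.add(u)
                frontier.append(u)
    return visited


def select_seeds(rr_sets, k):
    """Greedily pick up to k nodes that cover the most RR sets.

    Returns the seed list and the fraction of RR sets covered.
    """
    covers = defaultdict(set)  # node -> indices of RR sets containing it
    for i, rr in enumerate(rr_sets):
        for v in rr:
            covers[v].add(i)
    seeds, covered = [], set()
    for _ in range(k):
        best = max(covers, key=lambda v: len(covers[v] - covered), default=None)
        if best is None or not (covers[best] - covered):
            break  # no remaining node adds coverage
        seeds.append(best)
        covered |= covers[best]
    return seeds, len(covered) / len(rr_sets)


if __name__ == "__main__":
    # Toy graph with edges 0->1, 0->2, 1->3, all with probability 1.0.
    in_neighbors = {1: [(0, 1.0)], 2: [(0, 1.0)], 3: [(1, 1.0)]}
    rng = random.Random(0)
    rr_sets = [sample_rr_set(in_neighbors, [0, 1, 2, 3], rng) for _ in range(200)]
    print(select_seeds(rr_sets, 1))
```

The fraction of RR sets covered by the chosen seeds, multiplied by the number of nodes, serves as an estimate of the expected influence spread. How many RR sets must be sampled to obtain an approximation guarantee is the subject of the sequential work this paper builds on and is not reproduced here.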
