Scalable and parallelizable processing of influence maximization for large-scale social networks?

As social network services connect people across the world, influence maximization, i.e., finding the most influential nodes (or individuals) in the network, is being actively researched with applications to viral marketing. One crucial challenge in scalable influence maximization processing is evaluating influence, which is #P-hard and thus hard to solve in polynomial time. We propose a scalable influence approximation algorithm, Independent Path Algorithm (IPA) for Independent Cascade (IC) diffusion model. IPA efficiently approximates influence by considering an independent influence path as an influence evaluation unit. IPA are also easily parallelized by simply adding a few lines of OpenMP meta-programming expressions. Also, overhead of maintaining influence paths in memory is relieved by safely throwing away insignificant influence paths. Extensive experiments conducted on large-scale real social networks show that IPA is an order of magnitude faster and uses less memory than the state of the art algorithms. Our experimental results also show that parallel versions of IPA speeds up further as the number of CPU cores increases, and more speed-up is achieved for larger datasets. The algorithms have been implemented in our demo application for influence maximization (available at http://dm.postech.ac.kr/ipa demo), which efficiently finds the most influential nodes in a social network.

[1]  Yu Wang,et al.  Community-based greedy algorithm for mining top-K influential nodes in mobile social networks , 2010, KDD.

[2]  Jon M. Kleinberg,et al.  Group formation in large social networks: membership, growth, and evolution , 2006, KDD '06.

[3]  Azadeh Iranmehr,et al.  Trust Management for Semantic Web , 2009, 2009 Second International Conference on Computer and Electrical Engineering.

[4]  Matthew Richardson,et al.  Mining knowledge-sharing sites for viral marketing , 2002, KDD.

[5]  Matthew Richardson,et al.  Trust Management for the Semantic Web , 2003, SEMWEB.

[6]  Wei Chen,et al.  Efficient influence maximization in social networks , 2009, KDD.

[7]  Jure Leskovec,et al.  Community Structure in Large Networks: Natural Cluster Sizes and the Absence of Large Well-Defined Clusters , 2008, Internet Math..

[8]  Christos Faloutsos,et al.  Graphs over time: densification laws, shrinking diameters and possible explanations , 2005, KDD '05.

[9]  Éva Tardos,et al.  Influential Nodes in a Diffusion Model for Social Networks , 2005, ICALP.

[10]  Masahiro Kimura,et al.  Prediction of Information Diffusion Probabilities for Independent Cascade Model , 2008, KES.

[11]  Masahiro Kimura,et al.  Tractable Models for Information Diffusion in Social Networks , 2006, PKDD.

[12]  Laks V. S. Lakshmanan,et al.  Learning influence probabilities in social networks , 2010, WSDM '10.

[13]  Wei Chen,et al.  Scalable influence maximization for prevalent viral marketing in large-scale social networks , 2010, KDD.

[14]  G. Amdhal,et al.  Validity of the single processor approach to achieving large scale computing capabilities , 1967, AFIPS '67 (Spring).

[15]  Éva Tardos,et al.  Maximizing the Spread of Influence through a Social Network , 2015, Theory Comput..

[16]  L. Dagum,et al.  OpenMP: an industry standard API for shared-memory programming , 1998 .

[17]  Nisheeth Shrivastava,et al.  Viral Marketing for Multiple Products , 2010, 2010 IEEE International Conference on Data Mining.

[18]  Andreas Krause,et al.  Cost-effective outbreak detection in networks , 2007, KDD '07.

[19]  Michael R. Lyu,et al.  Mining social networks using heat diffusion processes for marketing candidates selection , 2008, CIKM '08.

[20]  James R. Larus,et al.  Software and the Concurrency Revolution , 2005, ACM Queue.

[21]  Laks V. S. Lakshmanan,et al.  A Data-Based Approach to Social Influence Maximization , 2011, Proc. VLDB Endow..

[22]  Matthew Richardson,et al.  Mining the network value of customers , 2001, KDD '01.

[23]  Yifei Yuan,et al.  Scalable Influence Maximization in Social Networks under the Linear Threshold Model , 2010, 2010 IEEE International Conference on Data Mining.