A Billion-Scale Approximation Algorithm for Maximizing Benefit in Viral Marketing

Online social networks have become one of the most effective platforms for marketing and advertising. Through "word-of-mouth" exchanges, so-called viral marketing, influence and product adoption can spread from a few key influencers to billions of users in the network. To identify those key influencers, a great amount of work has been devoted to the influence maximization (IM) problem, which seeks a set of $k$ seed users that maximizes the expected influence. Unfortunately, IM rests on two impractical assumptions: 1) any seed user can be acquired at the same cost and 2) all users are equally interested in the advertisement. In this paper, we propose a new problem, called cost-aware targeted viral marketing (CTVM), to find the most cost-effective seed users, who can influence the users most relevant to the advertisement. Since CTVM is NP-hard, we design an efficient $(1 - 1/\sqrt{e} - \epsilon)$-approximation algorithm, named Billion-scale Cost-aware Targeted algorithm (BCT), to solve the problem in billion-scale networks. Compared with IM algorithms, we show that BCT is both theoretically and experimentally faster than the state of the art while providing better solution quality. Moreover, we prove that under the linear threshold model, BCT is the first sub-linear time algorithm for CTVM (and IM) in dense networks. We carry out a comprehensive set of experiments on real networks from diverse disciplines, with sizes up to several billion edges, to demonstrate the superiority of BCT in both the CTVM and IM settings. Experiments on a Twitter data set, containing 1.46 billion social relations and 106 million tweets, show that BCT can identify key influencers in trending topics in only a few minutes.
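
To make the cost-aware, targeted objective concrete, the following is a minimal sketch of a budget-constrained greedy selection rule in the spirit of budgeted maximum coverage. It is not the BCT algorithm. It assumes a fixed budget, and that each candidate seed deterministically reaches a known set of users, whereas the actual problem concerns expected influence under a stochastic diffusion model. The names `reach`, `cost`, `benefit`, and `budget` are hypothetical and chosen only for illustration.

```python
from typing import Dict, Iterable, List, Set


def covered_benefit(seeds: Iterable[str],
                    reach: Dict[str, Set[int]],
                    benefit: Dict[int, float]) -> float:
    """Total benefit of all users reached by at least one seed."""
    covered: Set[int] = set()
    for s in seeds:
        covered |= reach[s]
    return sum(benefit[u] for u in covered)


def cost_aware_greedy(reach: Dict[str, Set[int]],
                      cost: Dict[str, float],
                      benefit: Dict[int, float],
                      budget: float) -> List[str]:
    """Toy selection rule (an assumption, not the paper's method):
    repeatedly add the affordable seed with the best marginal
    benefit per unit cost, then compare the result against the
    best single affordable seed and return the better of the two.
    Costs are assumed to be strictly positive."""
    chosen: List[str] = []
    covered: Set[int] = set()
    spent = 0.0
    candidates = set(reach)

    while True:
        best, best_ratio = None, 0.0
        for s in sorted(candidates):  # sorted for deterministic ties
            if spent + cost[s] > budget:
                continue  # seed no longer fits in the budget
            gain = sum(benefit[u] for u in reach[s] - covered)
            ratio = gain / cost[s]
            if ratio > best_ratio:
                best, best_ratio = s, ratio
        if best is None:
            break  # nothing affordable adds benefit
        chosen.append(best)
        covered |= reach[best]
        spent += cost[best]
        candidates.remove(best)

    # Guard against the ratio rule missing one expensive, valuable seed.
    singles = [s for s in sorted(reach) if cost[s] <= budget]
    if singles:
        top = max(singles, key=lambda s: covered_benefit([s], reach, benefit))
        if covered_benefit([top], reach, benefit) > covered_benefit(chosen, reach, benefit):
            return [top]
    return chosen


if __name__ == "__main__":
    reach = {"a": {1, 2, 3}, "b": {3, 4}, "c": {5}}
    cost = {"a": 3.0, "b": 1.0, "c": 1.0}
    benefit = {1: 1.0, 2: 1.0, 3: 1.0, 4: 1.0, 5: 5.0}
    print(cost_aware_greedy(reach, cost, benefit, budget=2.0))
```

In this toy instance, seed "a" reaches the most users but exceeds the budget, while user 5 carries a high benefit, so the rule prefers the cheap seeds "c" and "b". Non-uniform costs and non-uniform user relevance are exactly the two aspects that the uniform IM formulation ignores.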
