Information and Influence Propagation in Social Networks

Research on social networks has exploded over the last decade. To a large extent, this has been fueled by the spectacular growth of social media and online social networking sites, which continue to grow at a very fast pace, as well as by the increasing availability of very large social network datasets for research. A rich body of this research has been devoted to analyzing the propagation of information, influence, innovations, infections, practices, and customs through networks. Can we build models to explain the way these propagations occur? How can we validate our models against available real datasets consisting of a social network and propagation traces that occurred in the past? These are just some of the questions studied by researchers in this area. Information propagation models find applications in viral marketing, outbreak detection, finding key blog posts to read in order to catch important stories, finding leaders or trendsetters, information feed ranking, etc. A number of algorithmic problems arising in these applications have been abstracted and studied extensively by researchers under the name of influence maximization. This book starts with a detailed description of well-established diffusion models, including the independent cascade model and the linear threshold model, that have been successful at explaining propagation phenomena. We describe their properties as well as numerous extensions to them, introducing aspects such as competition, budget, and time-criticality, among many others. We delve deep into the key problem of influence maximization, which selects key individuals to activate in order to influence a large fraction of a network. Influence maximization in classic diffusion models, including both the independent cascade and the linear threshold models, is computationally intractable: the optimization problem is NP-hard, and even computing the influence spread of a given seed set is #P-hard. We describe several approximation algorithms and scalable heuristics that have been proposed in the literature.
Finally, we address key issues that must be tackled in order to turn this research into practice, such as learning the strength with which individuals in a network influence each other, as well as practical aspects of this research, including the availability of datasets and software tools for facilitating research. We conclude with a discussion of various research problems that remain open, both from a technical perspective and from the viewpoint of transferring the results of research into industry-strength applications.
Table of Contents: Acknowledgments / Introduction / Stochastic Diffusion Models / Influence Maximization / Extensions to Diffusion Modeling and Influence Maximization / Learning Propagation Models / Data and Software for Information/Influence Propagation Research / Conclusion and Challenges / Bibliography / Authors' Biographies / Index
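As a rough illustration of the independent cascade model and the influence maximization problem mentioned above, the following is a minimal sketch rather than an algorithm taken from the book. It assumes a directed graph stored as a dictionary of (neighbor, probability) lists, estimates influence spread by Monte Carlo simulation, and picks seeds with a plain greedy loop. The function names, the graph representation, and the default of 200 runs are all hypothetical choices made for this example.

```python
import random


def simulate_ic(graph, seeds, rng):
    """Run one random independent cascade and return the set of activated nodes.

    graph maps a node to a list of (out_neighbor, activation_probability) pairs.
    """
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for u in frontier:
            for v, p in graph.get(u, []):
                # A newly activated node gets a single chance to activate
                # each of its currently inactive out-neighbors.
                if v not in active and rng.random() < p:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return active


def estimate_spread(graph, seeds, runs=200, seed=0):
    """Monte Carlo estimate of the expected number of activated nodes."""
    rng = random.Random(seed)  # fixed seed so repeated calls are reproducible
    total = sum(len(simulate_ic(graph, seeds, rng)) for _ in range(runs))
    return total / runs


def greedy_seeds(graph, nodes, k, runs=200):
    """Pick k seeds, each time adding the node with the best estimated spread."""
    chosen = []
    for _ in range(k):
        candidates = [n for n in nodes if n not in chosen]
        best = max(candidates,
                   key=lambda n: estimate_spread(graph, chosen + [n], runs))
        chosen.append(best)
    return chosen


if __name__ == "__main__":
    # Toy graph (an assumption for illustration only).
    toy = {
        "a": [("b", 0.5), ("c", 0.5)],
        "b": [("d", 0.5)],
        "e": [("f", 0.5)],
    }
    print(greedy_seeds(toy, ["a", "b", "c", "d", "e", "f"], k=2))
```

Because each spread estimate requires many simulations, this naive loop scales poorly with network size, which is one reason scalable heuristics are of interest.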
