Affinity-driven blog cascade analysis and prediction

Information propagation within the blogosphere matters for applications such as implementing policies, marketing research, and launching new products. In this paper, we take a microscopic view of information propagation patterns in the blogosphere by investigating blog cascade affinity. A blog cascade is a group of linked posts discussing the same topic, and cascade affinity refers to a blog’s inclination to join a specific cascade. We identify and analyze an array of macroscopic and microscopic content-oblivious features that may affect a blogger’s cascade-joining behavior, and we use these features to predict the cascade affinity of blogs. Based on these features, we present two strategies, one non-probabilistic and one probabilistic: a support vector machine (SVM) classification-based approach and a Bipartite Markov Random Field-based (BiMRF) approach, respectively. Both predict the probability of a blog’s affinity to a cascade and rank blogs accordingly. Evaluated on a real dataset of 873,496 posts, our prediction strategies generate high-quality results (F1-measure of 72.5% for SVM and 71.1% for BiMRF), compared with approaches that use only traditional or singular features such as elapsed time or number of participants, which achieve around 11.2% and 8.9%, respectively. Our experiments also show that, among all the features identified, the number of quasi-friends is the most important factor affecting bloggers’ inclination to join cascades.
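As a rough illustration of the rank-then-evaluate step described above, the following minimal sketch ranks candidate blogs by a predicted affinity score and computes the F1-measure of the top-k prediction. The function names, blog identifiers, scores, and the choice of k are hypothetical and are not taken from the paper; the scores stand in for whatever an SVM or BiMRF model would output.

```python
from typing import Dict, List, Set


def rank_by_affinity(scores: Dict[str, float]) -> List[str]:
    """Return blog ids sorted from highest to lowest predicted affinity score."""
    # Ties are broken alphabetically so the ranking is deterministic.
    return sorted(scores, key=lambda blog: (-scores[blog], blog))


def f1_measure(predicted: Set[str], actual: Set[str]) -> float:
    """Harmonic mean of precision and recall; 0.0 when there is no overlap."""
    true_positives = len(predicted & actual)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(actual)
    return 2 * precision * recall / (precision + recall)


# Hypothetical affinity scores for four candidate blogs (assumed values).
scores = {"blog_a": 0.91, "blog_b": 0.15, "blog_c": 0.67, "blog_d": 0.42}
# Hypothetical ground truth: blogs that actually joined the cascade.
joined = {"blog_a", "blog_d"}

ranking = rank_by_affinity(scores)
top_k = set(ranking[:2])  # predict the two highest-ranked blogs as joiners
f1 = f1_measure(top_k, joined)
print(ranking)
print(f1)
```

With these assumed values, one of the two predicted blogs actually joined, so precision and recall are both 0.5 and the F1-measure is 0.5.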
