Model-based non-Gaussian interest topic distribution for user retweeting in social networks

Abstract: Retweeting behavior is critical for understanding information diffusion, innovation propagation, and event bursts in networks. However, because tweet content varies widely, recent work has focused mainly on influence relationships and has been unable to distinguish different pathways of information diffusion. Our work therefore tries to reveal such a pattern by tracking retweeting behavior through user interest and tweet categories. The key to modeling user interest is modeling the topic distribution of tweets, which has non-Gaussian characteristics (e.g., a power-law distribution). We therefore present the Latent Topics of user Interest (LTI) model, which makes full use of the non-Gaussian distribution of topics among tweets to uncover user interest and then predict users' likely actions. We divide users into conceit users and altruism users according to whether they show definite selectivity when retweeting, and we categorize tweets into repeated hot tweets and novel hot tweets according to whether their topics consistently occur in the training set. We then demonstrate a pattern: conceit users promote the diffusion of repeated hot tweets, whereas altruism users expand the diffusion of novel hot tweets. The pattern is evaluated by the correlation coefficient between user types and tweet types, which is greater than 0.61 for 10 and 100 million tweets from Weibo and Twitter, covering 70 and 58 thousand users, over a period of one month.
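The evaluation step described above, a correlation between user type and tweet type, can be illustrated with a minimal sketch. The abstract does not say which correlation coefficient is used, so the phi coefficient (Pearson correlation for two binary variables) is an assumption here. The function name `phi_coefficient`, the binary encoding, and the toy retweet counts are all hypothetical and are not results from the paper.

```python
from math import sqrt


def phi_coefficient(events):
    """Phi coefficient for a list of (user_is_conceit, tweet_is_repeated) pairs.

    Each pair is one retweet event encoded with 0/1 flags. This encoding is
    an assumption made for illustration, not the paper's stated procedure.
    """
    if not events:
        raise ValueError("need at least one retweet event")
    # Fill the 2x2 contingency table of user type versus tweet type.
    n11 = sum(1 for u, t in events if u == 1 and t == 1)
    n10 = sum(1 for u, t in events if u == 1 and t == 0)
    n01 = sum(1 for u, t in events if u == 0 and t == 1)
    n00 = sum(1 for u, t in events if u == 0 and t == 0)
    denom = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    if denom == 0:
        raise ValueError("one of the variables is constant")
    return (n11 * n00 - n10 * n01) / denom


# Toy data (invented): conceit users mostly retweet repeated hot tweets,
# altruism users mostly retweet novel hot tweets.
toy_events = [(1, 1)] * 40 + [(1, 0)] * 10 + [(0, 1)] * 10 + [(0, 0)] * 40
toy_phi = phi_coefficient(toy_events)
print(round(toy_phi, 2))
```

A value near 1 under this encoding would indicate that conceit users are associated with repeated hot tweets and altruism users with novel hot tweets; a value near 0 would indicate no association.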
