Modeling Topics and Behavior of Microbloggers

Microblogging encompasses both user-generated content and user behavior. When modeling microblogging data, one has to consider personal and background topics, as well as how these topics generate the observed content and behavior. In this article, we propose the Generalized Behavior-Topic (GBT) model for simultaneously modeling background topics and users’ topical interests in microblogging data. GBT considers multiple topical communities (or realms) with different background topical interests while learning the personal topics of each user and the user’s dependence on realms to generate both content and behavior. This differentiates GBT from previous works that consider either only one realm or only content data. By associating user behavior with the latent background and personal topics, GBT models user behavior in terms of both types of topics. GBT also distinguishes itself from earlier works by modeling multiple types of behavior together. Our experiments on two Twitter datasets show that GBT can effectively mine the representative topics for each realm. We also demonstrate that GBT significantly outperforms other state-of-the-art models in modeling content topics and user profiling.
