Dynamic joint sentiment-topic model

Social media data are produced continuously by a large and uncontrolled number of users. Because such data are dynamic, a sentiment and topic analysis model must also be updated dynamically so that it captures the most recent language used to express sentiments and topics. We propose a dynamic Joint Sentiment-Topic model (dJST), which allows the detection and tracking of views of current and recurrent interest and of shifts in topic and sentiment. Both topic and sentiment dynamics are captured by assuming that the current sentiment-topic-specific word distributions are generated according to the word distributions at previous epochs. We study three ways of accounting for this dependency: (1) a sliding window, where the current sentiment-topic-specific word distributions depend on those of the last S epochs; (2) a skip model, where historical sentiment-topic-specific word distributions are considered by skipping some epochs in between; and (3) a multiscale model, where previous long- and short-timescale distributions are taken into consideration. We derive efficient online inference procedures to sequentially update the model with newly arrived data, and show the effectiveness of the proposed model on Mozilla add-on reviews crawled between 2007 and 2011.
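The three dependency schemes differ mainly in which past epochs feed the current word distributions. The sketch below illustrates that difference only and is not the model's inference procedure. Several details are assumptions that the abstract does not specify: the skip model is assumed to use power-of-two offsets (t-1, t-2, t-4, ...), the multiscale model is assumed to average the last 2^s epochs at scale s, and the helper names (`sliding_window_epochs`, `skip_epochs`, `multiscale_epochs`, `prior_mean`) and the weights are hypothetical.

```python
from typing import Dict, List, Sequence


def sliding_window_epochs(t: int, S: int) -> List[List[int]]:
    """Each of the last S epochs contributes its own distribution."""
    return [[t - s] for s in range(1, S + 1) if t - s >= 0]


def skip_epochs(t: int, S: int) -> List[List[int]]:
    """Assumed skip pattern: epochs t-1, t-2, t-4, ..., t-2**(S-1)."""
    return [[t - 2 ** s] for s in range(S) if t - 2 ** s >= 0]


def multiscale_epochs(t: int, S: int) -> List[List[int]]:
    """Assumed multiscale pattern: scale s pools the last 2**s epochs."""
    return [[e for e in range(t - 2 ** s, t) if e >= 0] for s in range(S)]


def prior_mean(
    history: Dict[int, List[float]],
    groups: Sequence[Sequence[int]],
    weights: Sequence[float],
) -> List[float]:
    """Weighted mix of historical word distributions.

    `history` maps an epoch to the word distribution of one
    sentiment-topic pair. Epochs inside a group are averaged first,
    then the groups are combined using `weights` (hypothetical values;
    a real model would estimate them from data).
    """
    vocab_size = len(next(iter(history.values())))
    mean = [0.0] * vocab_size
    total_weight = 0.0
    for group, weight in zip(groups, weights):
        if not group:
            continue  # no history available at this scale yet
        for v in range(vocab_size):
            mean[v] += weight * sum(history[e][v] for e in group) / len(group)
        total_weight += weight
    if total_weight == 0.0:
        raise ValueError("no usable history for the requested epochs")
    return [m / total_weight for m in mean]


# Toy history over a two-word vocabulary: even epochs favour word 0,
# odd epochs favour word 1.
history = {e: [1.0, 0.0] if e % 2 == 0 else [0.0, 1.0] for e in range(8)}
window_prior = prior_mean(history, sliding_window_epochs(8, 3), [1.0, 1.0, 1.0])
```

Under these assumptions, at epoch 8 with S = 3 the sliding window looks at epochs 7, 6, and 5, the skip variant at 7, 6, and 4, and the multiscale variant at the pooled spans 7, 6 to 7, and 4 to 7. The skip and multiscale variants therefore reach further back in time with the same number of components.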
