RETRACTED ARTICLE: A joint model for analyzing topic and sentiment dynamics from large-scale online news

Many of today’s online news websites and aggregator apps let users publish their opinions regardless of time and place. Existing work on topic-based sentiment analysis of product reviews cannot be applied directly to online news for two reasons: (1) the dynamic nature of news streams requires the topic and sentiment analysis model to be updated dynamically as well, and (2) user interactions among news comments can easily lead to inaccurate topic extraction and sentiment classification. In this paper, we propose a novel probabilistic generative model, DTSA, that extracts topics and their associated sentiments from news streams and simultaneously analyzes their evolution over time. In DTSA, three timescale models (continuous, skip, and multiple timescale) are studied to account for how the sentiment-topic word distributions at the current epoch depend on those of earlier epochs. Additionally, we consider the links among news comments to avoid the errors caused by user interactions. To mine more interpretable topics, a Conditional Random Fields (CRF) model is adopted to label a set of meaningful phrases that augment the bag-of-words features. Finally, we derive distributed online inference procedures to update the model with newly arrived data and show the effectiveness of the proposed model on real-world data sets.
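As an illustration of the three timescale models, the sketch below shows one possible way to select historical sentiment-topic word distributions and mix them into a prior for the current epoch. It is not the DTSA implementation: the abstract does not give the exact definitions, so the doubling lags in the skip model, the doubling window widths in the multiple timescale model, the mixing weights, and all function names (`continuous_history`, `skip_history`, `multiscale_history`, `evolutionary_prior`) are assumptions made for this example.

```python
from typing import List, Sequence

Dist = List[float]  # word distribution over a fixed vocabulary, sums to 1


def _average(dists: Sequence[Dist]) -> Dist:
    """Element-wise mean of several word distributions."""
    n = len(dists)
    return [sum(column) / n for column in zip(*dists)]


def continuous_history(history: Sequence[Dist], num_slices: int) -> List[Dist]:
    """Continuous model (assumed): the most recent consecutive epochs, newest first."""
    return [history[-lag] for lag in range(1, num_slices + 1) if lag <= len(history)]


def skip_history(history: Sequence[Dist], num_slices: int) -> List[Dist]:
    """Skip model (assumed): epochs at lags 1, 2, 4, ..., newest first."""
    lags = [2 ** s for s in range(num_slices)]
    return [history[-lag] for lag in lags if lag <= len(history)]


def multiscale_history(history: Sequence[Dist], num_slices: int) -> List[Dist]:
    """Multiple timescale model (assumed): averages over the last 1, 2, 4, ... epochs."""
    slices = []
    for s in range(num_slices):
        width = 2 ** s
        if width > len(history):
            break
        slices.append(_average(history[-width:]))
    return slices


def evolutionary_prior(slices: Sequence[Dist], weights: Sequence[float]) -> Dist:
    """Weighted mixture of historical slices, renormalised by the weights actually used."""
    used = list(zip(weights, slices))  # truncates to the shorter of the two
    total = sum(w for w, _ in used)
    vocab_size = len(slices[0])
    return [sum(w * d[i] for w, d in used) / total for i in range(vocab_size)]


# Toy history for one sentiment-topic pair over a two-word vocabulary, oldest epoch first.
history = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0], [0.2, 0.8]]
prior = evolutionary_prior(continuous_history(history, 2), [0.5, 0.5])
```

In an online setting, a prior of this kind would be recomputed whenever a new epoch of news arrives, so only a bounded amount of history has to be kept. The three selectors differ in how far back they reach for the same number of slices.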
