Integrating word status for joint detection of sentiment and aspect in reviews

A crucial task in sentiment analysis is aspect detection: the step of selecting the aspects on which opinions are expressed. This step precedes the step of determining whether the opinions on those aspects are positive or negative. This article proposes a novel probabilistic generative topic model for aspect-based sentiment analysis that is able to discover the latent structure of a large collection of review documents. The proposed joint sentiment-aspect detection model (SAM) incorporates the structure of review sentences to detect aspects and sentiments simultaneously. The intuitions behind SAM are that documents are generated from latent single- and multi-word topics, that a word distribution is modelled for each topic, and that a prior distribution over topics is learned for the sentences of documents. SAM introduces word status so that the model can decide whether to sample a word from a bigram distribution or a unigram distribution, and it integrates all these components into one combined model for aspect-based sentiment analysis. We evaluate SAM both qualitatively and quantitatively to show that the model performs the task effectively and improves significantly over standard joint sentiment-aspect models. The proposed model can easily be transferred between domains or languages and can detect the polarity of text data at various levels. For the quantitative analysis, however, we mainly present results for document-level sentiment classification.
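
The role of word status can be illustrated with a toy generative step. The sketch below is not the SAM model itself: it assumes a single fixed topic, hand-written probability tables (`UNIGRAM`, `BIGRAM`) and a fixed switching probability `p_bigram`, all of which are hypothetical names and values chosen for illustration. It shows only the general idea that a binary status decides whether the next word is drawn from a bigram distribution conditioned on the previous word or from the topic's unigram distribution.

```python
import random

# Hypothetical toy tables; a real model would learn these from review data.
UNIGRAM = {"battery": {"battery": 0.5, "charge": 0.3, "life": 0.2}}
BIGRAM = {("battery", "battery"): {"life": 0.9, "charge": 0.1}}


def sample_word(prev_word, topic, unigram, bigram, p_bigram, rng):
    """Draw one word; the status flag picks the bigram or unigram table."""
    # A bigram draw is only possible if a table exists for (topic, prev_word).
    has_bigram = prev_word is not None and (topic, prev_word) in bigram
    # status = 1: extend a multi-word term; status = 0: draw from the topic.
    status = 1 if has_bigram and rng.random() < p_bigram else 0
    dist = bigram[(topic, prev_word)] if status == 1 else unigram[topic]
    words = list(dist)
    weights = [dist[w] for w in words]
    word = rng.choices(words, weights=weights, k=1)[0]
    return word, status


def generate_sentence(topic, length, unigram, bigram, p_bigram, seed=0):
    """Generate (word, status) pairs for one sentence under a single topic."""
    rng = random.Random(seed)
    prev, out = None, []
    for _ in range(length):
        word, status = sample_word(prev, topic, unigram, bigram, p_bigram, rng)
        out.append((word, status))
        prev = word
    return out


if __name__ == "__main__":
    for word, status in generate_sentence("battery", 6, UNIGRAM, BIGRAM, 0.7):
        print(word, status)
```

Words generated with status 1 form a multi-word term with their predecessor (for example "battery life"), while words with status 0 stand alone. In this sketch the switching probability is a constant; treating the status as a latent variable inferred from data is what a full model would do.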
