Aspect-Level Sentiment Analysis Based on a Generalized Probabilistic Topic and Syntax Model

In recent years, a number of topic models that extend the basic LDA model have been proposed for sentiment analysis. In this paper, we apply a generalized topic and syntax model called Part-of-Speech LDA (POSLDA) to sentiment analysis, and propose several feature selection methods that separate entities from the modifiers that describe them. Combined with a Maximum Entropy classifier, the selected features support sentiment analysis at both the document and aspect levels. The advantage of POSLDA is that it automatically separates semantic and syntactic classes and extends easily to aspect-level sentiment analysis by mapping topics to aspects. However, words in the noun-related classes, which are also treated as semantic classes, should be removed as far as possible to reduce their impact on sentiment analysis. To evaluate the effectiveness of our solutions, we conducted experiments on two collections of review documents and obtained accuracy competitive with previous work on sentiment analysis.
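The pipeline described above (select modifier words, drop noun-related words, then train a Maximum Entropy classifier) can be illustrated with a minimal document-level sketch. It is not the paper's implementation: the `WORD_CLASS` table is a hand-made, hypothetical stand-in for the word classes that POSLDA would infer, the toy reviews are invented for illustration, and the classifier is a plain binary logistic regression (the two-class case of Maximum Entropy) trained by batch gradient ascent.

```python
import math
from collections import Counter

# Hypothetical word-to-class table standing in for classes inferred by POSLDA.
WORD_CLASS = {
    "camera": "noun", "battery": "noun", "screen": "noun",
    "great": "adjective", "poor": "adjective", "sharp": "adjective",
    "terrible": "adjective", "love": "verb", "hate": "verb",
}
# Assumption: modifier classes are kept, noun-related classes are dropped.
KEPT_CLASSES = {"adjective", "verb"}


def select_features(tokens):
    """Count only the words whose assumed class is a modifier class."""
    return Counter(t for t in tokens if WORD_CLASS.get(t) in KEPT_CLASSES)


def predict_proba(weights, bias, feats):
    """Probability of the positive class under the logistic model."""
    z = bias + sum(weights[f] * v for f, v in feats.items())
    return 1.0 / (1.0 + math.exp(-z))


def train_maxent(docs, labels, epochs=200, lr=0.5):
    """Fit a binary MaxEnt model by batch gradient ascent on log-likelihood."""
    weights, bias = Counter(), 0.0
    for _ in range(epochs):
        grad, grad_bias = Counter(), 0.0
        for feats, y in zip(docs, labels):
            err = y - predict_proba(weights, bias, feats)
            for f, v in feats.items():
                grad[f] += err * v
            grad_bias += err
        for f, g in grad.items():
            weights[f] += lr * g / len(docs)
        bias += lr * grad_bias / len(docs)
    return weights, bias


# Toy labeled reviews (1 = positive, 0 = negative), invented for illustration.
reviews = [
    ("great camera sharp screen", 1),
    ("love battery", 1),
    ("poor battery terrible screen", 0),
    ("hate camera", 0),
]
docs = [select_features(text.split()) for text, _ in reviews]
labels = [label for _, label in reviews]
weights, bias = train_maxent(docs, labels)

positive = predict_proba(weights, bias, select_features("great sharp screen".split()))
negative = predict_proba(weights, bias, select_features("terrible camera".split()))
```

Because the noun-class words are filtered out before training, `camera` and `screen` carry no weight, so the two test reviews are scored only on their modifiers. An aspect-level variant would additionally group features by the topic (aspect) assigned to each word, which this sketch does not attempt.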
