On the design of LDA models for aspect-based opinion mining

Aspect-based opinion mining, which aims to extract aspects and their corresponding ratings from customer reviews, gives customers useful information for making purchase decisions. In the past few years, several probabilistic graphical models have been proposed to address this problem, most of them based on Latent Dirichlet Allocation (LDA). While these models have much in common, certain characteristics distinguish them from each other. These fundamental differences correspond to major decisions made in the design of the LDA models. Although research papers typically claim that a new model outperforms the existing ones, there is normally no "one-size-fits-all" model. In this paper, we present a set of design guidelines for aspect-based opinion mining by discussing a series of increasingly sophisticated LDA models. We argue that these models capture the essence of the major published methods and allow us to isolate the impact of individual design decisions. We conduct extensive experiments on a very large real-life dataset from Epinions.com (500K reviews) and compare the performance of the different models in terms of the likelihood of the held-out test set and the accuracy of aspect identification and rating prediction.
