Product aspect extraction supervised with online domain knowledge

One of the most challenging problems in aspect-based opinion mining is aspect extraction, which aims to identify expressions that describe aspects of products (called aspect expressions) and to group domain-specific synonymous expressions into categories. Although many aspect extraction methods have been proposed, very few are designed to improve the interpretability of the generated aspects. Existing methods either generate many fine-grained aspects without proper categorization or group semantically unrelated product aspects together (e.g., through unsupervised topic modeling). In this paper, we first review previous studies on product aspect extraction. To overcome the limitations of existing methods, we then propose two novel semi-supervised models for product aspect extraction. Specifically, the proposed methodology first extracts seeding aspects and related terms from the detailed product descriptions readily available on e-commerce websites. Next, product reviews are regrouped according to these seeding aspects to build more effective textual contexts for topic modeling. Finally, two semi-supervised topic models are developed to extract human-comprehensible product aspects. In the first model, the Fine-grained Labeled LDA (FL-LDA), the seeding aspects guide the model to discover words related to them. The second model, the Unified Fine-grained Labeled LDA (UFL-LDA), extends FL-LDA by incorporating unlabeled documents, so that it extracts both words related to the seeding aspects and other high-frequency words in customer reviews. Our experimental results demonstrate that the proposed methods outperform state-of-the-art methods.
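The regrouping step can be illustrated with a minimal Python sketch. The abstract does not say how reviews are matched to seeding aspects, so plain keyword matching is assumed here. The seed terms, the example reviews, and the names `SEED_TERMS`, `tokenize`, and `regroup_reviews` are all hypothetical and are not taken from the paper.

```python
import re
from collections import defaultdict

# Hypothetical seeding aspects and related terms, as might be read off a
# product description page (illustrative values, not from the paper).
SEED_TERMS = {
    "battery": {"battery", "charge", "charging"},
    "screen": {"screen", "display", "resolution"},
}


def tokenize(text):
    """Lowercase the text and return its alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def regroup_reviews(sentences, seed_terms):
    """Group sentences into one pseudo-document per seeding aspect.

    Assumption: a sentence joins every aspect whose seed terms it mentions.
    Sentences that match no aspect are returned separately as unlabeled text.
    """
    grouped = defaultdict(list)
    unlabeled = []
    for sentence in sentences:
        tokens = set(tokenize(sentence))
        matched = [a for a, terms in seed_terms.items() if tokens & terms]
        for aspect in matched:
            grouped[aspect].append(sentence)
        if not matched:
            unlabeled.append(sentence)
    return dict(grouped), unlabeled


reviews = [
    "The battery lasts two days on a single charge.",
    "Bright display, but the battery drains fast.",
    "Shipping was quick.",
]
grouped, unlabeled = regroup_reviews(reviews, SEED_TERMS)
```

In this sketch each aspect's grouped sentences would serve as a labeled context for a seed-guided topic model, while the unmatched sentences correspond loosely to the unlabeled documents that UFL-LDA is said to incorporate.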

[1]  Alice H. Oh,et al.  Aspect and sentiment unification model for online review analysis , 2011, WSDM '11.

[2]  Bing Liu,et al.  Mining Opinion Features in Customer Reviews , 2004, AAAI.

[3]  Susan T. Dumais,et al.  Partially labeled topic models for interpretable text mining , 2011, KDD.

[4]  Lillian Lee,et al.  Opinion Mining and Sentiment Analysis , 2008, Found. Trends Inf. Retr..

[5]  Martin Ester,et al.  ILDA: interdependent LDA model for learning latent aspects and their ratings from online product reviews , 2011, SIGIR.

[6]  Hongfei Yan,et al.  Jointly Modeling Aspects and Opinions with a MaxEnt-LDA Hybrid , 2010, EMNLP.

[7]  Claire Cardie,et al.  Proceedings of the Eighteenth International Conference on Machine Learning, 2001, p. 577–584. Constrained K-means Clustering with Background Knowledge , 2022 .

[8]  Mitsuru Ishizuka,et al.  Graph-based Word Clustering using a Web Search Engine , 2006, EMNLP.

[9]  Andrew McCallum,et al.  A comparison of event models for naive bayes text classification , 1998, AAAI 1998.

[10]  Bing Liu,et al.  Opinion observer: analyzing and comparing opinions on the Web , 2005, WWW '05.

[11]  Erik Cambria,et al.  Sentic Activation: A Two-Level Affective Common Sense Reasoning Framework , 2012, AAAI.

[12]  Arjun Mukherjee,et al.  Aspect Extraction through Semi-Supervised Modeling , 2012, ACL.

[13]  Xiaoyan Zhu,et al.  Movie review mining and summarization , 2006, CIKM '06.

[14]  Hanna M. Wallach,et al.  Topic modeling: beyond bag-of-words , 2006, ICML.

[15]  Jiahao Zhang,et al.  Sample cutting method for imbalanced text sentiment classification based on BRC , 2013, Knowl. Based Syst..

[16]  Zheng Lin,et al.  Towards jointly extracting aspects and aspect-specific sentiment knowledge , 2012, CIKM.

[17]  Bing Liu,et al.  Mining and summarizing customer reviews , 2004, KDD.

[18]  Oren Etzioni,et al.  Extracting Product Features and Opinions from Reviews , 2005, HLT.

[19]  Jane Yung-jen Hsu,et al.  Building a Concept-Level Sentiment Dictionary Based on Commonsense Knowledge , 2013, IEEE Intelligent Systems.

[20]  Bing Liu,et al.  Opinion Feature Extraction Using Class Sequential Rules , 2006, AAAI Spring Symposium: Computational Approaches to Analyzing Weblogs.

[21]  Xuanjing Huang,et al.  Phrase Dependency Parsing for Opinion Mining , 2009, EMNLP.

[22]  Meng Wang,et al.  Aspect Ranking: Identifying Important Product Aspects from Online Consumer Reviews , 2011, ACL.

[23]  Zhong Su,et al.  Product feature categorization with multilevel latent semantic association , 2009, CIKM.

[24]  Gregor Heinrich Parameter estimation for text analysis , 2009 .

[25]  A. McCallum,et al.  Topical N-Grams: Phrase and Topic Discovery, with an Application to Information Retrieval , 2007, Seventh IEEE International Conference on Data Mining (ICDM 2007).

[26]  Chun Chen,et al.  Opinion Word Expansion and Target Extraction through Double Propagation , 2011, CL.

[27]  Björn W. Schuller,et al.  New Avenues in Opinion Mining and Sentiment Analysis , 2013, IEEE Intelligent Systems.

[28]  Thomas Hofmann,et al.  Unsupervised Learning by Probabilistic Latent Semantic Analysis , 2004, Machine Learning.

[29]  Ivan Titov,et al.  Modeling online reviews with multi-grain topic models , 2008, WWW.

[30]  Yue Lu,et al.  Latent aspect rating analysis on review text data: a rating regression approach , 2010, KDD.

[31]  Gregor Heinrich,et al.  A Generic Approach to Topic Models , 2009, ECML/PKDD.

[32]  Diego Reforgiato Recupero,et al.  Frame-Based Detection of Opinion Holders and Topics: A Model and a Tool , 2014, IEEE Computational Intelligence Magazine.

[33]  Xiaojin Zhu,et al.  Incorporating domain knowledge into topic modeling via Dirichlet Forest priors , 2009, ICML '09.

[34]  Ivan Titov,et al.  A Joint Model of Text and Aspect Ratings for Sentiment Summarization , 2008, ACL.

[35]  Li Chen,et al.  Preference-based clustering reviews for augmenting e-commerce recommendation , 2013, Knowl. Based Syst..

[36]  Chong Wang,et al.  Reading Tea Leaves: How Humans Interpret Topic Models , 2009, NIPS.

[37]  D. Newman,et al.  THE DISTRIBUTION OF RANGE IN SAMPLES FROM A NORMAL POPULATION, EXPRESSED IN TERMS OF AN INDEPENDENT ESTIMATE OF STANDARD DEVIATION , 1939 .

[38]  Ramesh Nallapati,et al.  Labeled LDA: A supervised topic model for credit attribution in multi-labeled corpora , 2009, EMNLP.

[39]  William M. Rand,et al.  Objective Criteria for the Evaluation of Clustering Methods , 1971 .

[40]  David M. Blei,et al.  Supervised Topic Models , 2007, NIPS.

[41]  Hua Xu,et al.  Constrained LDA for Grouping Product Features in Opinion Mining , 2011, PAKDD.

[42]  Erik Cambria,et al.  SenticNet 3: A Common and Common-Sense Knowledge Base for Cognition-Driven Sentiment Analysis , 2014, AAAI.

[43]  Andrew McCallum,et al.  Rethinking LDA: Why Priors Matter , 2009, NIPS.

[44]  Hua Xu,et al.  Clustering product features for opinion mining , 2011, WSDM '11.

[45]  Martin Ester,et al.  The FLDA model for aspect-based opinion mining: addressing the cold start problem , 2013, WWW.

[46]  Fadi Biadsy,et al.  Contextual Phrase-Level Polarity Analysis Using Lexical Affect Scoring and Syntactic N-Grams , 2009, EACL.

[47]  Raymond Y. K. Lau,et al.  A Probabilistic Generative Model for Mining Cybercriminal Networks from Online Social Media , 2014, IEEE Computational Intelligence Magazine.

[48]  Michael I. Jordan,et al.  Latent Dirichlet Allocation , 2001, J. Mach. Learn. Res..

[49]  Meng Wang,et al.  Product Aspect Ranking and Its Applications , 2014, IEEE Transactions on Knowledge and Data Engineering.

[50]  Erik Cambria,et al.  SenticNet 2: A Semantic and Affective Resource for Opinion Mining and Sentiment Analysis , 2012, FLAIRS.

[51]  Padhraic Smyth,et al.  Modeling General and Specific Aspects of Documents with a Probabilistic Topic Model , 2006, NIPS.

[52]  Mark Steyvers,et al.  Finding scientific topics , 2004, Proceedings of the National Academy of Sciences of the United States of America.

[53]  Giuseppe Carenini,et al.  Extracting knowledge from evaluative text , 2005, K-CAP '05.

[54]  Bing Liu,et al.  Web Data Mining: Exploring Hyperlinks, Contents, and Usage Data , 2006, Data-Centric Systems and Applications.

[55]  Erik Cambria,et al.  Application of multi-dimensional scaling and artificial neural networks for biologically inspired opinion mining , 2013, BICA 2013.

[56]  Dipankar Das,et al.  Enhanced SenticNet with Affective Labels for Concept-Based Opinion Mining , 2013, IEEE Intelligent Systems.