Discovering Product Defects and Solutions from Online User Generated Contents

The recent growth of online user generated content (UGC) has made a large number of posts about products and services available. These posts often contain complaints from the consumers who purchased those products and services. However, discovering and summarizing product defects and related knowledge from large quantities of user posts is a difficult task. Traditional aspect opinion mining models, which aim to discover product aspects and their corresponding opinions, are not sufficient for discovering product defect information from user posts. In this paper, we propose the Product Defect Latent Dirichlet Allocation model (PDLDA), a probabilistic model that identifies domain-specific knowledge about product issues using interdependent three-dimensional topics: Component, Symptom, and Resolution. We also introduce a Gibbs sampling based inference method for PDLDA. To evaluate our model, we introduce three novel product review datasets. Both qualitative and quantitative evaluations show that the proposed model improves the quality of the discovered product defect information. Our model has the potential to benefit customers, manufacturers, and policy makers by automatically discovering product defects from online data.
