Identifying Product Defects from User Complaints: A Probabilistic Defect Model

The recent surge in social media use has produced a massive volume of unstructured textual complaints about products and services. However, discovering and quantifying potential product defects from large amounts of unstructured text is a nontrivial task. In this paper, we develop a probabilistic defect model (PDM) that simultaneously identifies the most critical product issues and the corresponding product attributes. We leverage domain-oriented key attributes of a product (e.g., product model, year of production, defective components, and symptoms) to identify and acquire complete information about each defect. We conduct comprehensive quantitative and qualitative evaluations to assess the quality of the discovered information. Experimental results demonstrate that our proposed model outperforms an existing unsupervised method (K-Means clustering) and can find more valuable information. Our research has significant implications for managers, manufacturers, and policy makers. [Category: Data and Text Mining]
