OpinionLink: Leveraging user opinions for product catalog enrichment

Abstract: A vast number of user opinions are available from reviews posted on e-commerce websites. Although these opinions are a valuable source of knowledge for both manufacturers and customers, their volume exceeds human cognitive processing capacity, which can be a major bottleneck for their effective use. To address this problem, a number of opinion-summarization methods have been proposed that organize opinions by grouping them around aspects. However, these methods tend to generate an excessive number of aspect groups that are frequently overly generic and difficult to interpret. We argue that a superior alternative is to organize opinions around product attributes as defined in a product catalog. Typically, product attributes correspond to the most important characteristics of the products. Furthermore, they are common to all products in a given category and thus form a more stable set than aspects. In this paper, we propose a novel approach called OpinionLink to enrich products in a catalog, at the attribute granularity level, with opinions extracted from product reviews. The proposed approach is divided into two phases. In the first phase, OpinionLink uses a classifier to identify opinionated sentences in the reviews of a particular product. In the second phase, another classifier maps the opinions extracted in the first phase to the attributes of the products in the product catalog. We performed a series of experiments on both phases. For the first phase, classifiers using the proposed features achieved an average F1 of 0.87 on the task of identifying opinionated sentences. For the second phase, the method we proposed for the opinion-mapping task achieved an average F1 of 0.85. Further, we verified the effectiveness of the proposed approach as a realistic end-to-end application, indicating that OpinionLink can be used in a real setting. Finally, we empirically demonstrate the feasibility of applying the proposed approach to an extremely large volume of opinions, using a collection of more than 600,000 real reviews. We also set forth a number of directions for future research.
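The two-phase flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the classifiers or their features, so both phases are replaced here by simple keyword rules. The cue-word list, the attribute names, their trigger words, and all function names are hypothetical and serve only to show how the output of phase one feeds phase two.

```python
from typing import Dict, List, Set

# Hypothetical opinion cue words; a trained classifier would replace this list.
OPINION_CUES: Set[str] = {
    "great", "good", "bad", "poor", "love", "hate", "excellent", "terrible",
}

# Hypothetical catalog attributes for one product category, with trigger words.
CATALOG_ATTRIBUTES: Dict[str, Set[str]] = {
    "battery": {"battery", "charge", "charging"},
    "screen": {"screen", "display"},
    "camera": {"camera", "photo", "photos"},
}


def tokenize(sentence: str) -> Set[str]:
    """Lowercase each whitespace-separated token and strip basic punctuation."""
    return {tok.strip(".,!?;:").lower() for tok in sentence.split()}


def is_opinionated(sentence: str) -> bool:
    """Phase 1 stand-in: a sentence is opinionated if it contains a cue word."""
    return bool(tokenize(sentence) & OPINION_CUES)


def map_to_attributes(sentence: str) -> List[str]:
    """Phase 2 stand-in: return the attributes whose trigger words appear."""
    tokens = tokenize(sentence)
    return [attr for attr, triggers in CATALOG_ATTRIBUTES.items() if tokens & triggers]


def enrich(review_sentences: List[str]) -> Dict[str, List[str]]:
    """Run both phases and group opinionated sentences by catalog attribute."""
    enriched: Dict[str, List[str]] = {attr: [] for attr in CATALOG_ATTRIBUTES}
    for sentence in review_sentences:
        if not is_opinionated(sentence):
            continue  # phase 1 filters out non-opinionated sentences
        for attr in map_to_attributes(sentence):
            enriched[attr].append(sentence)  # phase 2 links opinion to attribute
    return enriched


if __name__ == "__main__":
    reviews = [
        "The battery life is great.",
        "I bought it in March.",
        "Terrible screen, but the camera is excellent!",
    ]
    for attribute, opinions in enrich(reviews).items():
        print(attribute, opinions)
```

In this sketch a sentence may be linked to more than one attribute, and a sentence that passes phase one but matches no attribute is simply dropped; whether OpinionLink behaves the same way is not stated in the abstract.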
