Clients’ Freely Written Assessment as the Source of Automatically Mined Opinions

Abstract: When measuring the quality of products or services, a challenging task is to reveal clients’ satisfaction or sentiment. As people have many opportunities to express their opinions through various on-line channels (e.g., discussions, microblogs, social networks), the question is whether such data might be used for this purpose. Information hidden in the data includes why people perceive products or services as good or bad, the reasons for clients’ satisfaction or dissatisfaction, and what affects their sentiment. However, the large amounts of data needed can hardly be processed manually. This paper presents a method for the automatic discovery of the sources of human feelings hidden in the textual messages that clients produce. For demonstration, freely written reviews containing subjective evaluations of medical services were used. During the analysis, clusters grouping whole reviews (or individual sentences) with a requested degree of similarity were created in an unsupervised manner. Then, a decision tree classifier was trained to find the attributes (words) of the reviews that were significant for assigning the reviews to the clusters. Because individual words were sometimes not informative enough, they were subsequently used as a starting point for searching for frequent multi-word expressions. As a result, a list of multi-word phrases representing frequent and important sources of clients’ opinions was produced.
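The last two steps of the pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the cluster labels are assumed to be given already, the decision tree is replaced by a single-level information-gain ranking (the entropy-based criterion that decision tree learners typically apply at each split), and the toy reviews and all function names are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of cluster labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(docs, labels, word):
    """Entropy reduction obtained by splitting reviews on presence of `word`."""
    with_word = [lab for doc, lab in zip(docs, labels) if word in doc]
    without = [lab for doc, lab in zip(docs, labels) if word not in doc]
    remainder = sum(
        len(part) / len(labels) * entropy(part)
        for part in (with_word, without)
        if part
    )
    return entropy(labels) - remainder


def significant_words(docs, labels, top_k=3):
    """Words ranked by how well their presence separates the clusters."""
    vocab = sorted({w for doc in docs for w in doc})
    ranked = sorted(
        vocab, key=lambda w: information_gain(docs, labels, w), reverse=True
    )
    return ranked[:top_k]


def frequent_phrases(docs, seed, n=2, min_count=2):
    """Frequent n-word expressions that contain the significant word `seed`."""
    counts = Counter()
    for doc in docs:
        for i in range(len(doc) - n + 1):
            gram = tuple(doc[i:i + n])
            if seed in gram:
                counts[gram] += 1
    return [(" ".join(g), c) for g, c in counts.most_common() if c >= min_count]


# Hypothetical toy data: four reviews already assigned to two clusters.
reviews = [
    "the waiting time was too long",
    "long waiting time at the reception",
    "the doctor was very kind",
    "very kind doctor and nurse",
]
labels = [0, 0, 1, 1]
docs = [review.split() for review in reviews]

print(significant_words(docs, labels, top_k=6))
print(frequent_phrases(docs, "waiting"))
print(frequent_phrases(docs, "kind"))
```

On this toy data a word such as "waiting" separates the two clusters perfectly but says little by itself, while the phrase "waiting time" built around it names a concrete source of dissatisfaction, which is the motivation the abstract gives for moving from single words to multi-word expressions.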
