A machine learning approach for the identification of the deceptive reviews in the hospitality sector using unique attributes and sentiment orientation

Abstract The popularity of online reviews has a strong impact on consumers’ purchase intentions for goods and services. However, hidden behind the anonymity of the Internet, fraudsters can try to manipulate other consumers by posting fake reviews. Because of the huge volume of online opinions generated every day, maintaining trust in online reviews requires automatic tools based on machine learning. This paper focuses on the hospitality sector and follows a content analysis approach based on a set of unique attributes and the sentiment orientation of reviews. The main contributions of the paper are i) a set of polarity-oriented unique attributes able to distinguish deceptive from non-deceptive reviews of both positive and negative polarity and ii) the main topics associated with positive and negative deceptive and non-deceptive reviews. Findings reveal that positive and negative unique attributes lead to non-biased classifiers and that experience-based reviews tend to be non-deceptive.
