Fake consumer review detection using deep neural networks integrating word embeddings and emotion mining

Fake consumer review detection has attracted much interest in recent years owing to the increasing number of Internet purchases. Existing approaches rely on the review content, product and reviewer information, and other features to detect fake reviews. However, as shown in recent studies, the semantic meaning of reviews might be particularly important for text classification. In addition, the emotions hidden in the reviews may represent another potential indicator of fake content. To improve the performance of fake review detection, here we propose two neural network models that integrate the traditional bag-of-words representation with word context and consumer emotions. Specifically, the models learn a document-level representation from three sets of features: (1) n-grams, (2) word embeddings and (3) various lexicon-based emotion indicators. This high-dimensional feature representation is used to detect fake reviews in four domains. To demonstrate the effectiveness of the presented detection systems, we compare their classification performance with several state-of-the-art methods for fake review detection. The proposed systems perform well on all datasets, irrespective of their sentiment polarity and product category.
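As a rough illustration of the three feature sets named above, the following sketch concatenates n-gram counts, an averaged word embedding and lexicon-based emotion counts into one document vector. It is not the models proposed here: the embedding table, the emotion lexicon, the vocabulary and all function names are hypothetical toy placeholders, and the neural network that would consume the vector is omitted.

```python
from collections import Counter

# Hypothetical toy resources (assumptions for illustration only); a real
# system would use trained word embeddings and a published emotion lexicon.
TOY_EMBEDDINGS = {
    "great": [0.9, 0.1],
    "terrible": [-0.8, 0.2],
    "hotel": [0.0, 0.7],
}
TOY_EMOTION_LEXICON = {
    "great": {"joy"},
    "terrible": {"anger", "sadness"},
}
EMOTIONS = ["anger", "joy", "sadness"]


def tokenize(text):
    """Lowercase, split on whitespace and strip basic punctuation."""
    stripped = (t.strip(".,!?").lower() for t in text.split())
    return [t for t in stripped if t]


def ngram_counts(tokens, vocabulary, n_max=2):
    """Count unigrams up to n_max-grams, reported in vocabulary order."""
    grams = Counter()
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams[" ".join(tokens[i:i + n])] += 1
    return [float(grams[g]) for g in vocabulary]


def mean_embedding(tokens, dim=2):
    """Average the embeddings of known tokens; zeros if none are known."""
    vectors = [TOY_EMBEDDINGS[t] for t in tokens if t in TOY_EMBEDDINGS]
    if not vectors:
        return [0.0] * dim
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def emotion_counts(tokens):
    """Count lexicon hits per emotion category, in EMOTIONS order."""
    counts = Counter()
    for t in tokens:
        for emotion in TOY_EMOTION_LEXICON.get(t, ()):
            counts[emotion] += 1
    return [float(counts[e]) for e in EMOTIONS]


def document_features(text, vocabulary):
    """Concatenate the three feature sets into one document-level vector."""
    tokens = tokenize(text)
    return (ngram_counts(tokens, vocabulary)
            + mean_embedding(tokens)
            + emotion_counts(tokens))


VOCABULARY = ["great", "hotel", "great hotel", "terrible"]
features = document_features("Great hotel, great staff!", VOCABULARY)
print(features)
```

The resulting vector has one slot per vocabulary n-gram, one per embedding dimension and one per emotion category. In practice, a vector of this kind would be passed to a classifier to separate fake from genuine reviews.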
