Creating and detecting fake reviews of online products

Abstract: Customers increasingly rely on reviews for product information. However, the usefulness of online reviews is undermined by fake reviews that give an untruthful picture of product quality, so fake reviews need to be detected. So far, automatic detection has had only partial success in this challenging task. In this research, we address both the creation and the detection of fake reviews. First, we experiment with two language models, ULMFiT and GPT-2, to generate fake product reviews based on an Amazon e-commerce dataset. Using the better model, GPT-2, we create a dataset for a fake review classification task. We show that a machine classifier can accomplish this task near-perfectly, whereas human raters exhibit significantly lower accuracy and agreement than the tested algorithms. The model was also effective at detecting human-generated fake reviews. The results imply that, while fake review detection is challenging for humans, “machines can fight machines” in the task of detecting fake reviews. Our findings have implications for consumer protection, the defense of firms against unfair competition, and the responsibility of review platforms.
