A benchmark study of machine learning models for online fake news detection

Abstract The proliferation of fake news and its propagation on social media have become a major concern because of the devastating impact they can have. Various machine learning approaches have been proposed to detect fake news. However, most of them focus on a specific type of news (such as political news), which raises the question of dataset bias in the models used. In this research, we conducted a benchmark study to assess the performance of different applicable machine learning approaches on three different datasets, the largest and most diversified of which we compiled ourselves. We explored a number of advanced pre-trained language models for fake news detection alongside traditional and deep learning models and, to the best of our knowledge, compared their performance from different aspects for the first time. We find that BERT and similar pre-trained models perform best for fake news detection, especially with very small datasets. Hence, these models are a significantly better option for languages with limited electronic content, i.e., limited training data. We also carried out several analyses based on the models' performance, article topic, and article length, and discuss the lessons learned from them. We believe that this benchmark study will help the research community to explore further and help news sites and blogs to select the most appropriate fake news detection method.
