Exploring Generalizability of Fine-Tuned Models for Fake News Detection

The COVID-19 pandemic has caused a dramatic and parallel rise in dangerous misinformation, termed an ‘infodemic’ by the CDC and WHO. Misinformation tied to the COVID-19 infodemic changes continuously; this can degrade the performance of fine-tuned models through concept drift. Degradation can be mitigated if models generalize well enough to capture some cyclical aspects of drifted data. In this paper, we explore the generalizability of pre-trained and fine-tuned fake news detectors across nine fake news datasets. We show that existing models often overfit on their training dataset and perform poorly on unseen data. However, on some subsets of unseen data that overlap with training data, models have higher accuracy. Based on this observation, we also present KMeans-Proxy, a fast and effective method based on K-Means clustering for identifying these overlapping subsets of unseen data. KMeans-Proxy improves generalizability on unseen fake news datasets by 0.1 to 0.2 F1 points across datasets. We present both our generalizability experiments and KMeans-Proxy to further research on the fake news problem.
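The abstract does not describe how KMeans-Proxy works internally, so the following is only a minimal sketch of the general idea of using K-Means clusters to flag unseen samples that overlap with training data. Everything here is an assumption made for illustration: the function names (`kmeans`, `fit_overlap_detector`, `overlaps_training`), the farthest-point initialization, the per-cluster radius taken from a distance quantile, and the use of plain float tuples in place of model embeddings.

```python
import math


def kmeans(points, k, iters=50):
    """Plain Lloyd's K-Means over a list of equal-length float tuples.

    Assumption: deterministic farthest-point initialization, chosen only
    to keep this sketch reproducible.
    """
    centers = [points[0]]
    while len(centers) < k:
        # Next center: the point farthest from all centers chosen so far.
        centers.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        )
    for _ in range(iters):
        # Assignment step: group each point with its nearest center.
        groups = [[] for _ in range(k)]
        for p in points:
            idx = min(range(k), key=lambda c: math.dist(p, centers[c]))
            groups[idx].append(p)
        # Update step: move each center to the mean of its group
        # (an empty group keeps its previous center).
        new_centers = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centers[i]
            for i, g in enumerate(groups)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers


def fit_overlap_detector(train_points, k, quantile=0.95):
    """Cluster training points and record one radius per cluster.

    Assumption: a cluster's radius is the `quantile` of the distances
    from its training points to its center.
    """
    centers = kmeans(train_points, k)
    dists = [[] for _ in range(k)]
    for p in train_points:
        idx = min(range(k), key=lambda c: math.dist(p, centers[c]))
        dists[idx].append(math.dist(p, centers[idx]))
    radii = []
    for d in dists:
        d.sort()
        radii.append(d[min(len(d) - 1, int(quantile * len(d)))] if d else 0.0)
    return centers, radii


def overlaps_training(point, centers, radii):
    """True if `point` lies within the radius of its nearest cluster."""
    idx = min(range(len(centers)), key=lambda c: math.dist(point, centers[c]))
    return math.dist(point, centers[idx]) <= radii[idx]
```

Under these assumptions, a detector would be fitted once on training-set embeddings, and `overlaps_training` would then decide, per unseen sample, whether the fine-tuned model's prediction is likely to be trustworthy or whether the sample falls outside the regions the model was trained on.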
