COSMOS: Catching Out-of-Context Misinformation with Self-Supervised Learning

Despite the recent attention to DeepFakes, one of the most prevalent ways to mislead audiences on social media is the use of unaltered images in a new but false context. To address this challenge and support fact-checkers, we propose a new method that automatically detects out-of-context image and text pairs. Our key insight is to leverage the grounding of images with text to distinguish out-of-context scenarios that cannot be disambiguated with language alone. We propose a self-supervised training strategy that requires only a set of captioned images. At train time, our method learns to selectively align individual objects in an image with textual claims, without explicit supervision. At test time, we check whether both captions correspond to the same object(s) in the image but are semantically different, which allows us to make fairly accurate out-of-context predictions. Our method achieves 85% out-of-context detection accuracy. To facilitate benchmarking of this task, we create a large-scale dataset of 200K images with 450K textual captions from a variety of news websites, blogs, and social media posts. The dataset and source code are publicly available at https://shivangi-aneja.github.io/projects/cosmos/.
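The test-time check described above can be illustrated with a minimal sketch. It assumes that a grounding model has already selected, for each of the two captions, the image region that best matches it, and that a separate sentence-similarity model has already scored the two captions in [0, 1]. The `Box` type, the function names, and both 0.5 thresholds are hypothetical choices made for this illustration and are not taken from the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned image region in pixel coordinates (hypothetical helper type)."""
    x1: float
    y1: float
    x2: float
    y2: float

    def area(self) -> float:
        # Degenerate or inverted boxes count as empty.
        return max(0.0, self.x2 - self.x1) * max(0.0, self.y2 - self.y1)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes; 0.0 when they do not overlap."""
    inter = Box(max(a.x1, b.x1), max(a.y1, b.y1),
                min(a.x2, b.x2), min(a.y2, b.y2)).area()
    union = a.area() + b.area() - inter
    return inter / union if union > 0 else 0.0


def is_out_of_context(box_caption1: Box,
                      box_caption2: Box,
                      caption_similarity: float,
                      iou_threshold: float = 0.5,   # assumed value
                      sim_threshold: float = 0.5    # assumed value
                      ) -> bool:
    """Flag a pair of captions as out-of-context for one image.

    The pair is flagged only when both captions ground to (roughly) the
    same object AND the captions themselves are semantically different.
    """
    same_object = iou(box_caption1, box_caption2) > iou_threshold
    semantically_different = caption_similarity < sim_threshold
    return same_object and semantically_different


# Usage: both captions point at nearly the same region, but disagree in meaning.
region_a = Box(0, 0, 10, 10)
region_b = Box(1, 1, 10, 10)
flag = is_out_of_context(region_a, region_b, caption_similarity=0.2)
```

Requiring both conditions is the point of the design: two dissimilar captions that describe different objects in the same image are not contradictory, so they are not flagged.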
