Twitter-COMMs: Detecting Climate, COVID, and Military Multimodal Misinformation

Detecting out-of-context media, such as “miscaptioned” images on Twitter, often requires spotting inconsistencies between an image and its accompanying text. This paper describes our approach to the Image-Text Inconsistency Detection challenge of the DARPA Semantic Forensics (SemaFor) Program. First, we collect Twitter-COMMs, a large-scale multimodal dataset of 884k tweets relevant to the topics of Climate Change, COVID-19, and Military Vehicles. We then train our approach, built on the state-of-the-art CLIP model, using automatically generated random and hard negatives. Our method is evaluated on a hidden, human-generated test set, where we achieve the best result on the program leaderboard: an 11% improvement in detection over a zero-shot CLIP baseline in a high-precision regime.
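
To make the approach concrete, the sketch below illustrates one plausible way to generate random and hard negatives and to score image-text consistency with CLIP. It is a minimal illustration under stated assumptions, not the paper's actual pipeline: the tweet schema, the `make_negatives` and `consistency_score` helpers, and the choice of the Hugging Face `openai/clip-vit-base-patch32` checkpoint are all assumptions introduced here for exposition.

```python
import random

import torch
from transformers import CLIPModel, CLIPProcessor

# Pretrained CLIP; the paper fine-tunes CLIP, but the exact variant
# and training recipe here are assumptions.
model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
model.eval()


def make_negatives(tweets):
    """Build (image, caption, label) triples from a list of tweets.

    Each tweet is a dict with "image" (PIL.Image), "caption" (str),
    and "topic" (str) keys -- a hypothetical schema for illustration.
    """
    triples = []
    for i, t in enumerate(tweets):
        # Positive: the tweet's own image/caption pair (label 1).
        triples.append((t["image"], t["caption"], 1))

        # Random negative: a caption swapped in from any other tweet.
        j = random.choice([k for k in range(len(tweets)) if k != i])
        triples.append((t["image"], tweets[j]["caption"], 0))

        # Hard negative: a caption from another tweet on the SAME topic
        # (e.g. a different COVID-19 tweet), so the mismatch is subtle.
        same_topic = [u for k, u in enumerate(tweets)
                      if k != i and u["topic"] == t["topic"]]
        if same_topic:
            triples.append(
                (t["image"], random.choice(same_topic)["caption"], 0))
    return triples


@torch.no_grad()
def consistency_score(image, caption):
    """Cosine similarity between CLIP image and text embeddings."""
    inputs = processor(text=[caption], images=image,
                       return_tensors="pt", padding=True, truncation=True)
    img = model.get_image_features(pixel_values=inputs["pixel_values"])
    txt = model.get_text_features(input_ids=inputs["input_ids"],
                                  attention_mask=inputs["attention_mask"])
    img = img / img.norm(dim=-1, keepdim=True)
    txt = txt / txt.norm(dim=-1, keepdim=True)
    # Low similarity suggests an inconsistent (miscaptioned) pair.
    return (img * txt).sum(dim=-1).item()
```

Fine-tuning CLIP on such triples (for example, with a contrastive or binary loss over the similarity scores) and then thresholding the score at a high-precision operating point would mirror the setup described above; the specific loss and threshold are left unspecified here.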