A Textual-Visual-Entailment-based Unsupervised Algorithm for Cheapfake Detection

The rapid growth of online communication has spread misinformation in many forms. "Cheapfake" is a recently coined term for manipulated media produced with non-AI techniques. One of the most prevalent ways to create a cheapfake is simply to alter the context of an image or video with a misleading caption. The "ACMMM 2022 Grand Challenge on Detecting Cheapfakes" poses the problem of catching such out-of-context misuse to assist fact-checkers, since detecting conflicting image-caption sets helps narrow the search space. To address this challenge, we propose a multimodal heuristic method. The proposed method extends the challenge's baseline (i.e., COSMOS) with four additional components (i.e., Natural Language Inference, Fabricated Claims Detection, Visual Entailment, and Online Caption Checking) to overcome the current weaknesses of the baseline. At evaluation time, our method achieves up to 89.1% accuracy on Task 1, which is 7.2% higher than the baseline, and 73% accuracy on Task 2. The code for our solution is publicly available on GitHub (https://github.com/pwnyniche/acmmmcheapfake2022), and the Docker image can be found on DockerHub (https://hub.docker.com/repository/docker/tqtnk2000/acmmmcheapfakes).
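The abstract describes combining the COSMOS baseline with four auxiliary component checks. As a minimal sketch only, assuming each component emits a boolean conflict flag and that flags are combined with a simple logical-OR heuristic (the actual combination rule is not specified in the abstract), the decision step might look like:

```python
def is_out_of_context(cosmos_flag: bool,
                      nli_contradiction: bool,
                      fabricated_claim: bool,
                      visual_entailment_conflict: bool,
                      online_check_mismatch: bool) -> bool:
    """Hypothetical decision rule: flag an image-caption set as
    out-of-context if the COSMOS baseline or any of the four
    auxiliary components reports a conflict.

    All parameter names and the OR-combination are illustrative
    assumptions, not the paper's confirmed method.
    """
    return (cosmos_flag
            or nli_contradiction
            or fabricated_claim
            or visual_entailment_conflict
            or online_check_mismatch)


# Example: only the NLI component detects a contradiction.
print(is_out_of_context(False, True, False, False, False))  # True
```

In such a design, each auxiliary component can rescue cases the baseline misses, at the cost of potentially raising the false-positive rate; a weighted or majority-vote rule would be a natural alternative.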