Dynamic Inconsistency-aware DeepFake Video Detection

The spread of DeepFake videos poses a serious threat to information security, calling for effective methods to detect them. However, the performance of recent frame-based detection methods is limited because they ignore the inter-frame inconsistency of fake videos. In this paper, we propose a novel Dynamic Inconsistency-aware Network to address this inconsistency problem. It uses a Cross-Reference module (CRM) to capture both the global and local inter-frame inconsistencies. The CRM contains two parallel branches. The first branch takes faces from adjacent frames as input and calculates a structural similarity map as a global inconsistency representation. The second branch focuses only on the inter-frame variation of independent critical regions, which captures the local inconsistency. To the best of our knowledge, this is the first work to fully exploit inter-frame inconsistency information from both the global and local perspectives. Compared with existing methods, our model provides more accurate and robust detection on the FaceForensics++, DFDC-preview, and Celeb-DFv2 datasets.
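
The two kinds of signal the CRM branches look at can be illustrated with a small hand-written sketch. This is not the proposed network: the abstract does not specify how the similarity map or the region variation is computed, so the block-wise SSIM, the mean absolute difference, the function names, and the toy 8x8 frames below are all assumptions chosen only for illustration.

```python
from typing import List, Sequence, Tuple

Frame = List[List[float]]  # grayscale face crop, pixel values in [0, 255]


def block_ssim(a: Sequence[float], b: Sequence[float],
               c1: float = (0.01 * 255) ** 2,
               c2: float = (0.03 * 255) ** 2) -> float:
    """Standard SSIM index between two equally sized pixel blocks."""
    n = len(a)
    mu_a = sum(a) / n
    mu_b = sum(b) / n
    var_a = sum((x - mu_a) ** 2 for x in a) / n
    var_b = sum((y - mu_b) ** 2 for y in b) / n
    cov = sum((x - mu_a) * (y - mu_b) for x, y in zip(a, b)) / n
    return ((2 * mu_a * mu_b + c1) * (2 * cov + c2)) / (
        (mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2))


def ssim_map(f0: Frame, f1: Frame, block: int = 4) -> List[List[float]]:
    """Global view (assumed form): SSIM per non-overlapping block
    between two adjacent frames of the same size."""
    h, w = len(f0), len(f0[0])
    out = []
    for top in range(0, h - block + 1, block):
        row = []
        for left in range(0, w - block + 1, block):
            coords = [(y, x) for y in range(top, top + block)
                      for x in range(left, left + block)]
            a = [f0[y][x] for y, x in coords]
            b = [f1[y][x] for y, x in coords]
            row.append(block_ssim(a, b))
        out.append(row)
    return out


def region_variation(f0: Frame, f1: Frame,
                     box: Tuple[int, int, int, int]) -> float:
    """Local view (assumed form): mean absolute inter-frame change
    inside one region given as (top, left, bottom, right)."""
    top, left, bottom, right = box
    diffs = [abs(f1[y][x] - f0[y][x])
             for y in range(top, bottom) for x in range(left, right)]
    return sum(diffs) / len(diffs)


# Toy adjacent frames: frame_b equals frame_a except that the
# bottom-right 4x4 block is inverted, mimicking a local artifact.
frame_a: Frame = [[float(16 * y + 2 * x) for x in range(8)] for y in range(8)]
frame_b: Frame = [row[:] for row in frame_a]
for y in range(4, 8):
    for x in range(4, 8):
        frame_b[y][x] = 255.0 - frame_a[y][x]

global_map = ssim_map(frame_a, frame_b)                        # 2x2 grid of SSIM values
local_changed = region_variation(frame_a, frame_b, (4, 4, 8, 8))
local_static = region_variation(frame_a, frame_b, (0, 0, 4, 4))
```

In this toy case only the altered block yields a low similarity value and a non-zero region variation, which is the kind of inter-frame cue the two branches are described as capturing. In the actual model these cues would presumably be learned from face crops and detected critical regions rather than computed by fixed formulas.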
