ID-Reveal: Identity-aware DeepFake Video Detection

A major challenge in DeepFake forgery detection is that state-of-the-art algorithms are mostly trained to detect a specific manipulation method. As a result, these approaches generalize poorly across different types of facial manipulation, e.g., from face swapping to facial reenactment. To address this, we introduce ID-Reveal, a new approach that learns temporal facial features, specific to how a person moves while talking, by means of metric learning coupled with an adversarial training strategy. The advantage is that we do not need any training data of fakes; we train only on real videos. Moreover, we utilize high-level semantic features, which enable robustness to widespread and disruptive forms of post-processing. We perform a thorough experimental analysis on several publicly available benchmarks. Compared to the state of the art, our method improves generalization and is more robust to low-quality videos, which are commonly spread over social networks. In particular, we obtain an average improvement of more than 15% in terms of accuracy for facial reenactment on highly compressed videos.
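
To illustrate how an identity-aware detector trained only on real videos could be used at test time, the following is a minimal sketch, not the authors' implementation. It assumes that each video clip has already been mapped to an embedding vector by some learned network, and that a clip is compared against pristine reference clips of the claimed identity. The function names, the Euclidean distance, the toy two-dimensional embeddings, and the threshold value are all hypothetical choices made for illustration; the abstract does not specify them.

```python
import math

def euclidean(a, b):
    # Euclidean distance between two embedding vectors of equal length.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def min_reference_distance(test_embedding, reference_embeddings):
    # Distance from the test clip to the closest real reference clip
    # of the claimed identity (hypothetical decision statistic).
    return min(euclidean(test_embedding, ref) for ref in reference_embeddings)

def is_suspect(test_embedding, reference_embeddings, threshold):
    # Flag the clip when it is far from every reference of that identity.
    return min_reference_distance(test_embedding, reference_embeddings) > threshold

# Toy 2-D embeddings standing in for learned temporal facial features.
references = [[0.0, 0.0], [1.0, 0.0]]   # real clips of the claimed identity
close_clip = [0.9, 0.1]                 # moves like the reference identity
far_clip = [3.0, 4.0]                   # moves unlike the reference identity
THRESHOLD = 1.0                         # assumed value, would be tuned on real data

if __name__ == "__main__":
    print(is_suspect(close_clip, references, THRESHOLD))
    print(is_suspect(far_clip, references, THRESHOLD))
```

Because the decision depends only on distances to real footage of the same person, a sketch like this needs no examples of fakes, which mirrors the training setup described in the abstract.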
