Detecting Deepfakes with Metric Learning

With the arrival of several face-swapping applications such as FaceApp, SnapChat, MixBooth, FaceBlender and many more, the authenticity of digital media content is hanging on a very loose thread. On social media platforms, videos are widely circulated often at a high compression factor. In this work, we analyze several deep learning approaches in the context of deepfakes classification in high compression scenarios and demonstrate that a proposed approach based on metric learning can be very effective in performing such a classification. Using less number of frames per video to assess its realism, the metric learning approach using a triplet network architecture proves to be fruitful. It learns to enhance the feature space distance between the cluster of real and fake videos embedding vectors. We validated our approaches on two datasets to analyze the behavior in different environments. We achieved a state-of-the-art AUC score of 99.2% on the Celeb-DF dataset and accuracy of 90.71% on a highly compressed Neural Texture dataset. Our approach is especially helpful on social media platforms where data compression is inevitable.

[1]  Andreas Rössler,et al.  FaceForensics++: Learning to Detect Manipulated Facial Images , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[2]  Junichi Yamagishi,et al.  Distinguishing computer graphics from natural images using convolution neural networks , 2017, 2017 IEEE Workshop on Information Forensics and Security (WIFS).

[3]  Belhassen Bayar,et al.  A Deep Learning Approach to Universal Image Manipulation Detection Using a New Convolutional Layer , 2016, IH&MMSec.

[4]  François Chollet,et al.  Xception: Deep Learning with Depthwise Separable Convolutions , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[5]  James Philbin,et al.  FaceNet: A unified embedding for face recognition and clustering , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[6]  Justus Thies,et al.  Deferred neural rendering , 2019, ACM Trans. Graph..

[7]  Timothy Dozat,et al.  Incorporating Nesterov Momentum into Adam , 2016 .

[8]  Tal Hassner,et al.  FSGAN: Subject Agnostic Face Swapping and Reenactment , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[9]  Justus Thies,et al.  Face2Face: real-time face capture and reenactment of RGB videos , 2019, Commun. ACM.

[10]  Jessica J. Fridrich,et al.  Rich Models for Steganalysis of Digital Images , 2012, IEEE Transactions on Information Forensics and Security.

[11]  Yu Qiao,et al.  Joint Face Detection and Alignment Using Multitask Cascaded Convolutional Networks , 2016, IEEE Signal Processing Letters.

[12]  Saeid Nahavandi,et al.  Deep Learning for Deepfakes Creation and Detection , 2019, ArXiv.

[13]  Luisa Verdoliva,et al.  Media Forensics and DeepFakes: An Overview , 2020, IEEE Journal of Selected Topics in Signal Processing.

[14]  Davide Cozzolino,et al.  Recasting Residual-based Local Descriptors as Convolutional Neural Networks: an Application to Image Forgery Detection , 2017, IH&MMSec.

[15]  Junichi Yamagishi,et al.  MesoNet: a Compact Facial Video Forgery Detection Network , 2018, 2018 IEEE International Workshop on Information Forensics and Security (WIFS).

[16]  Honggang Qi,et al.  Celeb-DF: A New Dataset for DeepFake Forensics , 2019, ArXiv.

[17]  Aythami Morales,et al.  DeepFakes and Beyond: A Survey of Face Manipulation and Fake Detection , 2020, Inf. Fusion.