Deepfakes: temporal sequential analysis to detect face-swapped video clips using convolutional long short-term memory

Abstract. Deepfake (a portmanteau of “deep learning” and “fake”) is an artificial-intelligence technique for human image synthesis in which neural networks (NNs) superimpose existing (source) images or videos onto destination images or videos. Deepfake enthusiasts have used NNs to produce convincing face swaps. Deepfakes are a form of video or image forgery used to spread misinformation, invade privacy, and mask the truth, and they rely on trained algorithms, deep learning applications, and artificial intelligence. They have become a nuisance on social media, where fake videos are published by fusing a celebrity’s face onto an explicit video. The impact of deepfakes is alarming: politicians, senior corporate officers, and world leaders are being targeted by nefarious actors. We propose an approach that detects deepfake videos of politicians using temporally sequential frames. At the first level, the approach extracts frames from the forged video; at the second level, a deep depth-based convolutional long short-term memory model identifies the fake frames. The proposed model is evaluated on our newly collected ground-truth dataset of forged videos, built from source and destination video frames of famous politicians. Experimental results demonstrate the effectiveness of our method.
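The two-level flow described in the abstract can be sketched roughly as follows. This is only an illustrative outline, not the authors' implementation: the function names `sliding_windows` and `classify_video`, the window length, stride, threshold, and the mean-score aggregation are all assumptions introduced here, and the convolutional LSTM itself is replaced by a caller-supplied scoring function.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

Frame = TypeVar("Frame")


def sliding_windows(frames: Sequence[Frame], length: int, stride: int) -> List[List[Frame]]:
    """Level 1 (assumed form): group extracted frames into fixed-length,
    possibly overlapping temporal sequences."""
    if length <= 0 or stride <= 0:
        raise ValueError("length and stride must be positive")
    return [
        list(frames[start:start + length])
        for start in range(0, len(frames) - length + 1, stride)
    ]


def classify_video(
    frames: Sequence[Frame],
    score_sequence: Callable[[List[Frame]], float],
    length: int = 8,        # hypothetical sequence length
    stride: int = 4,        # hypothetical hop between sequences
    threshold: float = 0.5  # hypothetical decision threshold
) -> Tuple[bool, List[float]]:
    """Level 2 (assumed form): score every sequence with a temporal model
    (a stand-in for the convolutional LSTM) and average the scores."""
    windows = sliding_windows(frames, length, stride)
    if not windows:
        raise ValueError("clip is shorter than one sequence")
    scores = [score_sequence(window) for window in windows]
    mean_score = sum(scores) / len(scores)
    return mean_score >= threshold, scores


if __name__ == "__main__":
    # Toy clip: integers stand in for decoded frames; frames 6..9 are "forged".
    clip = list(range(10))

    # Stub scorer standing in for a trained model: flags any window
    # that contains a forged frame.
    def stub_scorer(window: List[int]) -> float:
        return 1.0 if any(f >= 6 for f in window) else 0.0

    is_fake, per_window = classify_video(clip, stub_scorer, length=4, stride=2)
    print(is_fake, per_window)
```

In a real system the stub scorer would be replaced by a trained temporal model, and the aggregation rule (mean, maximum, or voting) would be a design choice to validate on held-out data.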
