FakeCatcher: Detection of Synthetic Portrait Videos using Biological Signals

The recent proliferation of fake portrait videos poses direct threats to society, law, and privacy [1]. Believing a fake video of a politician, distributing fake pornographic content of celebrities, and fabricating impersonated fake videos as evidence in courts are just a few real-world consequences of deep fakes. We present a novel approach to detect synthetic content in portrait videos, as a preventive solution for the emerging threat of deep fakes; in other words, we introduce a deep fake detector. We observe that detectors blindly utilizing deep learning are not effective in catching fake content, as generative models produce formidably realistic results. Our key assertion is that biological signals hidden in portrait videos can be used as an implicit descriptor of authenticity, because they are neither spatially nor temporally preserved in fake content. To prove and exploit this assertion, we first apply several signal transformations to the pairwise separation problem, achieving 99.39% accuracy. Second, we use those findings to formulate a generalized classifier for fake content, by analyzing the proposed signal transformations and corresponding feature sets. Third, we generate novel signal maps and employ a CNN to improve our traditional classifier for detecting synthetic content. Lastly, we release an "in the wild" dataset of fake portrait videos that we collected as part of our evaluation process. We evaluate FakeCatcher on several datasets, obtaining accuracies of 96%, 94.65%, 91.50%, and 91.07% on Face Forensics [2], Face Forensics++ [3], CelebDF [4], and our new Deep Fakes Dataset, respectively. In addition, our approach achieves a significantly higher detection rate than the baselines, and does not depend on the source, generator, or properties of the fake content. We also analyze signals from various facial regions, under image distortions, with varying segment durations, from different generators, against unseen datasets, and under several dimensionality reduction techniques.

[1]  Simon S. Woo,et al.  Detecting Both Machine and Human Created Fake Face Images In the Wild , 2018, MPS@CCS.

[2]  Joseph O'Rourke,et al.  Handbook of Discrete and Computational Geometry, Second Edition , 1997 .

[3]  Rajat Subhra Chakraborty,et al.  Discrete Cosine Transform Residual Feature Based Filtering Forgery and Splicing Detection in JPEG Images , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[4]  Jukka Komulainen,et al.  Face Spoofing Detection Using Colour Texture Analysis , 2016, IEEE Transactions on Information Forensics and Security.

[5]  B. Scholkopf,et al.  Fisher discriminant analysis with kernels , 1999, Neural Networks for Signal Processing IX: Proceedings of the 1999 IEEE Signal Processing Society Workshop (Cat. No.98TH8468).

[6]  Andreas Rössler,et al.  FaceForensics++: Learning to Detect Manipulated Facial Images , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[7]  Alexei A. Efros,et al.  Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial Networks , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[8]  Bin Li,et al.  Identification of deep network generated images using disparities in color components , 2020, Signal Process..

[9]  Bao-Liang Lu,et al.  EEG-based emotion recognition during watching movies , 2011, 2011 5th International IEEE/EMBS Conference on Neural Engineering.

[10]  Pasin Israsena,et al.  Real-Time EEG-Based Happiness Detection System , 2013, TheScientificWorldJournal.

[11]  Raymond Y. K. Lau,et al.  Least Squares Generative Adversarial Networks , 2016, 2017 IEEE International Conference on Computer Vision (ICCV).

[12]  K. Balasamy,et al.  Lyapunov features based EEG signal classification by multi-class SVM , 2011, 2011 World Congress on Information and Communication Technologies.

[13]  Ran He,et al.  Beyond Face Rotation: Global and Local Perception GAN for Photorealistic and Identity Preserving Frontal View Synthesis , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[14]  Sébastien Marcel,et al.  DeepFakes: a New Threat to Face Recognition? Assessment and Detection , 2018, ArXiv.

[15]  Zhengguo Li,et al.  A Novel Framework for Remote Photoplethysmography Pulse Extraction on Compressed Videos , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[16]  George Manis,et al.  Heartbeat Time Series Classification With Support Vector Machines , 2009, IEEE Transactions on Information Technology in Biomedicine.

[17]  Weifeng Zhang.  Automatic modulation classification based on statistical features and Support Vector Machine , 2014, 2014 XXXIth URSI General Assembly and Scientific Symposium (URSI GASS).

[18]  Mohammad Soleymani,et al.  A Multimodal Database for Affect Recognition and Implicit Tagging , 2012, IEEE Transactions on Affective Computing.

[19]  Junde Song,et al.  Signal Classification Based on Spectral Correlation Analysis and SVM in Cognitive Radio , 2008, 22nd International Conference on Advanced Information Networking and Applications (aina 2008).

[20]  Sébastien Marcel,et al.  Face Anti-spoofing Based on General Image Quality Assessment , 2014, 2014 22nd International Conference on Pattern Recognition.

[21]  Shiguang Shan,et al.  Arbitrary Facial Attribute Editing: Only Change What You Want , 2017, ArXiv.

[22]  Gert Vegter,et al.  In handbook of discrete and computational geometry , 1997 .

[23]  Yong Yu,et al.  Face Transfer with Generative Adversarial Network , 2017, ArXiv.

[24]  Gerard de Haan,et al.  Robust Pulse Rate From Chrominance-Based rPPG , 2013, IEEE Transactions on Biomedical Engineering.

[25]  Paolo Bestagini,et al.  Aligned and Non-Aligned Double JPEG Detection Using Convolutional Neural Networks , 2017, J. Vis. Commun. Image Represent..

[26]  Michael S. Lazar,et al.  Spatial patterns underlying population differences in the background EEG , 2005, Brain Topography.

[27]  François Chollet,et al.  Xception: Deep Learning with Depthwise Separable Convolutions , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[28]  Kiran B. Raja,et al.  Fake Face Detection Methods: Can They Be Generalized? , 2018, 2018 International Conference of the Biometrics Special Interest Group (BIOSIG).

[29]  Frédo Durand,et al.  Eulerian video magnification for revealing subtle changes in the world , 2012, ACM Trans. Graph..

[30]  Ying Zhang,et al.  Automated face swapping and its detection , 2017, 2017 IEEE 2nd International Conference on Signal and Image Processing (ICSIP).

[31]  Aaron C. Courville,et al.  Improved Training of Wasserstein GANs , 2017, NIPS.

[32]  Dit-Yan Yeung,et al.  Convolutional LSTM Network: A Machine Learning Approach for Precipitation Nowcasting , 2015, NIPS.

[33]  Farinaz Koushanfar,et al.  Heart-to-heart (H2H): authentication for implanted medical devices , 2013, CCS.

[34]  Jaakko Lehtinen,et al.  Progressive Growing of GANs for Improved Quality, Stability, and Variation , 2017, ICLR.

[35]  Mahadev Satyanarayanan,et al.  OpenFace: A general-purpose face recognition library with mobile applications , 2016 .

[36]  Ming-Hsuan Yang,et al.  Generative Face Completion , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[37]  Larry S. Davis,et al.  Two-Stream Neural Networks for Tampered Face Detection , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[38]  Justus Thies,et al.  Deferred neural rendering , 2019, ACM Trans. Graph..

[39]  Paul A. Viola,et al.  Robust Real-Time Face Detection , 2001, Proceedings Eighth IEEE International Conference on Computer Vision. ICCV 2001.

[40]  Corinna Cortes,et al.  Support-Vector Networks , 1995, Machine Learning.

[41]  Justus Thies,et al.  Face2Face: real-time face capture and reenactment of RGB videos , 2019, Commun. ACM.

[42]  Hao Li,et al.  Protecting World Leaders Against Deep Fakes , 2019, CVPR Workshops.

[43]  Soo-Chang Pei,et al.  Relations Between Gabor Transforms and Fractional Fourier Transforms and Their Applications for Signal Processing , 2006, IEEE Transactions on Signal Processing.

[44]  Jean-Marc Odobez,et al.  Learning Multimodal Temporal Representation for Dubbing Detection in Broadcast Media , 2016, ACM Multimedia.

[45]  Junichi Yamagishi,et al.  Capsule-forensics: Using Capsule Networks to Detect Forged Images and Videos , 2018, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[46]  Andrew Brock,et al.  Neural Photo Editing with Introspective Adversarial Networks , 2016, ICLR.

[47]  Hao Li,et al.  paGAN: real-time avatars using dynamic textures , 2019, ACM Trans. Graph..

[48]  Patrick Pérez,et al.  MoFA: Model-Based Deep Convolutional Face Autoencoder for Unsupervised Monocular Reconstruction , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[49]  Sébastien Marcel,et al.  Speaker Inconsistency Detection in Tampered Video , 2018, 2018 26th European Signal Processing Conference (EUSIPCO).

[50]  Siwei Lyu,et al.  Exposing DeepFake Videos By Detecting Face Warping Artifacts , 2018, CVPR Workshops.

[51]  Raymond Chiong,et al.  Remote heart rate measurement using low-cost RGB face video: a technical literature review , 2018, Frontiers of Computer Science.

[52]  Paul Geladi,et al.  Principal Component Analysis , 1987, Comprehensive Chemometrics.

[53]  Yoshua Bengio,et al.  Generative Adversarial Nets , 2014, NIPS.

[54]  Joon Son Chung,et al.  Lip Reading Sentences in the Wild , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[55]  Hiroshi Ishikawa,et al.  Globally and locally consistent image completion , 2017, ACM Trans. Graph..

[56]  Daniel McDuff,et al.  DeepPhys: Video-Based Physiological Measurement Using Convolutional Attention Networks , 2018, ECCV.

[57]  Chih-Jen Lin,et al.  LIBSVM: A library for support vector machines , 2011, TIST.

[58]  Rosalind W. Picard,et al.  Non-contact, automated cardiac pulse measurements using video imaging and blind source separation , 2010, Optics Express.

[59]  Frédo Durand,et al.  Detecting Pulse from Head Motions in Video , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[60]  Patrick Pérez,et al.  Deep video portraits , 2018, ACM Trans. Graph..

[61]  Ainuddin Wahid Abdul Wahab,et al.  Copy-move forgery detection: Survey, challenges and future directions , 2016, J. Netw. Comput. Appl..

[62]  Xin Yang,et al.  Exposing Deep Fakes Using Inconsistent Head Poses , 2018, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[63]  Conrad S. Tucker,et al.  Bounded Kalman filter method for motion-robust, non-contact heart rate estimation. , 2018, Biomedical optics express.

[64]  Rama Chellappa,et al.  Disguised Faces in the Wild , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[65]  Wei Shen,et al.  Learning Residual Images for Face Attribute Manipulation , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[66]  Pascal Müller,et al.  On Realism of Architectural Procedural Models , 2017, Comput. Graph. Forum.

[67]  Siwei Lyu,et al.  In Ictu Oculi: Exposing AI Generated Fake Face Videos by Detecting Eye Blinking , 2018, ArXiv.

[68]  Jukka Komulainen,et al.  Audiovisual synchrony assessment for replay attack detection in talking face biometrics , 2015, Multimedia Tools and Applications.

[69]  David L. Donoho,et al.  WaveLab and Reproducible Research , 1995 .

[70]  Edward J. Delp,et al.  Deepfake Video Detection Using Recurrent Neural Networks , 2018, 2018 15th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS).

[71]  Francesc Moreno-Noguer,et al.  GANimation: Anatomically-aware Facial Animation from a Single Image , 2018, ECCV.

[72]  Christian Riess,et al.  Exploiting Visual Artifacts to Expose Deepfakes and Face Manipulations , 2019, 2019 IEEE Winter Applications of Computer Vision Workshops (WACVW).

[73]  Soumith Chintala,et al.  Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks , 2015, ICLR.

[74]  P. Welch.  The use of fast Fourier transform for the estimation of power spectra: A method based on time averaging over short, modified periodograms , 1967 .

[75]  Rama Chellappa,et al.  ExprGAN: Facial Expression Editing with Controllable Expression Intensity , 2017, AAAI.

[76]  Max Welling,et al.  Improved Variational Inference with Inverse Autoregressive Flow , 2016, NIPS 2016.

[77]  Yang Liu,et al.  Physics-Based Generative Adversarial Models for Image Restoration and Beyond , 2018, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[78]  Lai-Man Po,et al.  Motion-Resistant Remote Imaging Photoplethysmography Based on the Optical Properties of Skin , 2015, IEEE Transactions on Circuits and Systems for Video Technology.

[79]  Nicu Sebe,et al.  Self-Adaptive Matrix Completion for Heart Rate Estimation from Face Videos under Realistic Conditions , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[80]  Sergey Ioffe,et al.  Rethinking the Inception Architecture for Computer Vision , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[81]  Jung-Woo Ha,et al.  StarGAN: Unified Generative Adversarial Networks for Multi-domain Image-to-Image Translation , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[82]  Kim L. Boyer,et al.  Precision range image registration using a robust surface interpenetration measure and enhanced genetic algorithms , 2005, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[83]  Lorenzo Scalise,et al.  Heart rate measurement in neonatal patients using a webcamera , 2012, 2012 IEEE International Symposium on Medical Measurements and Applications Proceedings.

[84]  Gang Hua,et al.  Towards Open-Set Identity Preserving Face Synthesis , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[85]  Jan Kautz,et al.  Video-to-Video Synthesis , 2018, NeurIPS.

[86]  Andreas Rössler,et al.  FaceForensics: A Large-scale Video Dataset for Forgery Detection in Human Faces , 2018, ArXiv.

[87]  Xiaogang Wang,et al.  Deep Learning Face Attributes in the Wild , 2014, 2015 IEEE International Conference on Computer Vision (ICCV).

[88]  Junichi Yamagishi,et al.  MesoNet: a Compact Facial Video Forgery Detection Network , 2018, 2018 IEEE International Workshop on Information Forensics and Security (WIFS).

[89]  Bin Li,et al.  Detection of Deep Network Generated Images Using Disparities in Color Components , 2018, ArXiv.