OpenForensics: Large-Scale Challenging Dataset For Multi-Face Forgery Detection And Segmentation In-The-Wild

The proliferation of deepfake media is raising concerns among the public and relevant authorities. It has become essential to develop countermeasures against forged faces in social media. This paper presents a comprehensive study on two new countermeasure tasks: multi-face forgery detection and segmentation in-the-wild. Localizing forged faces among multiple human faces in unrestricted natural scenes is far more challenging than the traditional deepfake recognition task. To promote these new tasks, we have created the first large-scale dataset posing a high level of challenges that is designed with face-wise rich annotations explicitly for face forgery detection and segmentation, namely OpenForensics. With its rich annotations, our OpenForensics dataset has great potentials for research in both deepfake prevention and general human face detection. We have also developed a suite of benchmarks for these tasks by conducting an extensive evaluation of state-of-the-art instance detection and segmentation methods on our newly constructed dataset in various scenarios.

[1]  Fernando De la Torre,et al.  Facial Action Transfer with Personalized Bilinear Regression , 2012, ECCV.

[2]  Peter Wonka,et al.  SEAN: Image Synthesis With Semantic Region-Adaptive Normalization , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[3]  Trung-Nghia Le,et al.  Semantic Instance Meets Salient Object: Study on Video Semantic Salient Instance Segmentation , 2018, 2019 IEEE Winter Conference on Applications of Computer Vision (WACV).

[4]  Shuo Yang,et al.  WIDER FACE: A Face Detection Benchmark , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[5]  Fang Wen,et al.  FaceShifter: Towards High Fidelity And Occlusion Aware Face Swapping , 2019, ArXiv.

[6]  Bolei Zhou,et al.  Interpreting the Latent Space of GANs for Semantic Face Editing , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[7]  Larry S. Davis,et al.  Two-Stream Neural Networks for Tampered Face Detection , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[8]  Victor Lempitsky,et al.  Few-Shot Adversarial Learning of Realistic Neural Talking Head Models , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[9]  Kaiming He,et al.  Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[10]  Sinan Kalkan,et al.  Localization Recall Precision (LRP): A New Performance Metric for Object Detection , 2018, ECCV.

[11]  Tao Kong,et al.  SOLOv2: Dynamic, Faster and Stronger , 2020, ArXiv.

[12]  Jaakko Lehtinen,et al.  Analyzing and Improving the Image Quality of StyleGAN , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[13]  Kaiming He,et al.  Feature Pyramid Networks for Object Detection , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[14]  Yu-Gang Jiang,et al.  WildDeepfake: A Challenging Real-World Dataset for Deepfake Detection , 2020, ACM Multimedia.

[15]  Sébastien Marcel,et al.  DeepFakes: a New Threat to Face Recognition? Assessment and Detection , 2018, ArXiv.

[16]  Mohammad Reza Hosseinzadeh Taher,et al.  Learning Semantics-enriched Representation via Self-discovery, Self-classification, and Self-restoration , 2020, MICCAI.

[17]  Pietro Perona,et al.  Microsoft COCO: Common Objects in Context , 2014, ECCV.

[18]  Stefanos Zafeiriou,et al.  300 Faces In-The-Wild Challenge: database and results , 2016, Image Vis. Comput..

[19]  Yong Jae Lee,et al.  YOLACT: Real-Time Instance Segmentation , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[20]  Alexander C. Berg,et al.  RetinaMask: Learning to predict masks improves state-of-the-art single-shot detection for free , 2019, ArXiv.

[21]  Brian C. Lovell,et al.  Multi-Region Probabilistic Histograms for Robust and Scalable Identity Inference , 2009, ICB.

[22]  Hao Chen,et al.  FCOS: Fully Convolutional One-Stage Object Detection , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[23]  Ismail Elezi,et al.  CIAGAN: Conditional Identity Anonymization Generative Adversarial Networks , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[24]  Baining Guo,et al.  Face X-Ray for More General Face Forgery Detection , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[25]  François Chollet,et al.  Xception: Deep Learning with Depthwise Separable Convolutions , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[26]  Xin Yang,et al.  Exposing Deep Fakes Using Inconsistent Head Poses , 2018, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[27]  Jongyoul Park,et al.  CenterMask: Real-Time Anchor-Free Instance Segmentation , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[28]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[29]  Xiaojuan Qi,et al.  Global Texture Enhancement for Fake Face Detection in the Wild , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[30]  Alan C. Bovik,et al.  Blind/Referenceless Image Spatial Quality Evaluator , 2011, 2011 Conference Record of the Forty Fifth Asilomar Conference on Signals, Systems and Computers (ASILOMAR).

[31]  Trung-Nghia Le,et al.  Attention R-CNN for Accident Detection , 2020, 2020 IEEE Intelligent Vehicles Symposium (IV).

[32]  Junichi Yamagishi,et al.  Multi-task Learning for Detecting and Segmenting Manipulated Facial Images and Videos , 2019, 2019 IEEE 10th International Conference on Biometrics Theory, Applications and Systems (BTAS).

[33]  Yuning Jiang,et al.  SOLO: Segmenting Objects by Locations , 2020, ECCV.

[34]  Kaiming He,et al.  Focal Loss for Dense Object Detection , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[35]  Yongchao Gong,et al.  Mask Scoring R-CNN , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[36]  Andreas Rössler,et al.  FaceForensics++: Learning to Detect Manipulated Facial Images , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[37]  Patrick Pérez,et al.  Deep video portraits , 2018, ACM Trans. Graph..

[38]  Christian Riess,et al.  Exploiting Visual Artifacts to Expose Deepfakes and Face Manipulations , 2019, 2019 IEEE Winter Applications of Computer Vision Workshops (WACVW).

[39]  Yong Jae Lee,et al.  YOLACT++ Better Real-Time Instance Segmentation , 2019, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[40]  Derek Doran,et al.  What Does Explainable AI Really Mean? A New Conceptualization of Perspectives , 2017, CEx@AI*IA.

[41]  Andrew Zisserman,et al.  X2Face: A network for controlling face generation by using images, audio, and pose codes , 2018, ECCV.

[42]  Hani Hagras,et al.  Toward Human-Understandable, Explainable AI , 2018, Computer.

[43]  Timo Aila,et al.  A Style-Based Generator Architecture for Generative Adversarial Networks , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[44]  Shifeng Zhang,et al.  FaceBoxes: A CPU real-time face detector with high accuracy , 2017, 2017 IEEE International Joint Conference on Biometrics (IJCB).

[45]  Gang Hua,et al.  Towards Open-Set Identity Preserving Face Synthesis , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[46]  Jung-Woo Ha,et al.  StarGAN: Unified Generative Adversarial Networks for Multi-domain Image-to-Image Translation , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[47]  Junichi Yamagishi,et al.  MesoNet: a Compact Facial Video Forgery Detection Network , 2018, 2018 IEEE International Workshop on Information Forensics and Security (WIFS).

[48]  Hao Chen,et al.  BlendMask: Top-Down Meets Bottom-Up for Instance Segmentation , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[49]  Xingyi Zhou,et al.  Objects as Points , 2019, ArXiv.

[50]  Irene Kotsia,et al.  RetinaFace: Single-Shot Multi-Level Face Localisation in the Wild , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[51]  Chen Qian,et al.  ReenactGAN: Learning to Reenact Faces via Boundary Transfer , 2018, ECCV.

[52]  Lingyun Wu,et al.  MaskGAN: Towards Diverse and Interactive Facial Image Manipulation , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[53]  Justus Thies,et al.  Deferred Neural Rendering: Image Synthesis using Neural Textures , 2019 .

[54]  Stanislav Pidhorskyi,et al.  Adversarial Latent Autoencoders , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[55]  Felix Juefei-Xu,et al.  FakeSpotter: A Simple yet Robust Baseline for Spotting AI-Synthesized Fake Faces , 2019, IJCAI.

[56]  Esa Rahtu,et al.  ICface: Interpretable and Controllable Face Reenactment Using GANs , 2019, 2020 IEEE Winter Conference on Applications of Computer Vision (WACV).

[57]  Binh T. Nguyen,et al.  VinDr-CXR: An open dataset of chest X-rays with radiologist's annotations , 2020 .

[58]  Sebastian Ramos,et al.  The Cityscapes Dataset for Semantic Urban Scene Understanding , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[59]  Bolei Zhou,et al.  Places: A 10 Million Image Database for Scene Recognition , 2018, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[60]  Chunhua Shen,et al.  PolarMask: Single Shot Instance Segmentation With Polar Representation , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[61]  Iasonas Kokkinos,et al.  Deforming Autoencoders: Unsupervised Disentangling of Shape and Appearance , 2018, ECCV.

[62]  Xiangyu Zhu,et al.  Face Alignment in Full Pose Range: A 3D Total Solution , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[63]  Siwei Lyu,et al.  Exposing DeepFake Videos By Detecting Face Warping Artifacts , 2018, CVPR Workshops.

[64]  Patrick Pérez,et al.  Poisson image editing , 2003, ACM Trans. Graph..

[65]  Gang Yu,et al.  BiSeNet: Bilateral Segmentation Network for Real-time Semantic Segmentation , 2018, ECCV.

[66]  Siwei Lyu,et al.  In Ictu Oculi: Exposing AI Created Fake Videos by Detecting Eye Blinking , 2018, 2018 IEEE International Workshop on Information Forensics and Security (WIFS).

[67]  Jianzhu Guo,et al.  Towards Fast, Accurate and Stable 3D Dense Face Alignment , 2020, ECCV.

[68]  Wallace S. Rutkowski,et al.  TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE , 2022 .

[69]  Ross B. Girshick,et al.  Focal Loss for Dense Object Detection , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[70]  Daiheng Gao,et al.  DeepFaceLab: A simple, flexible and extensible face swapping framework , 2020, ArXiv.

[71]  Jaakko Lehtinen,et al.  Progressive Growing of GANs for Improved Quality, Stability, and Variation , 2017, ICLR.

[72]  Zhi Tian,et al.  Mask Encoding for Single Shot Instance Segmentation , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[73]  Chen Change Loy,et al.  DeeperForensics-1.0: A Large-Scale Dataset for Real-World Face Forgery Detection , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[74]  M. Pantic,et al.  Faces InThe-Wild Challenge : Database and Results , 2016 .

[75]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[76]  Jung-Woo Ha,et al.  StarGAN v2: Diverse Image Synthesis for Multiple Domains , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[77]  Ross B. Girshick,et al.  Mask R-CNN , 2017, 1703.06870.

[78]  Tal Hassner,et al.  FSGAN: Subject Agnostic Face Swapping and Reenactment , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[79]  Chunhua Shen,et al.  Conditional Convolutions for Instance Segmentation , 2020, ECCV.