Paper information - On the use of automatically generated synthetic image datasets for benchmarking face recognition


The availability of large-scale face datasets has been key to progress in face recognition. However, due to licensing issues or copyright infringement, some datasets are no longer available (e.g. MS-Celeb-1M). Recent advances in Generative Adversarial Networks (GANs) for synthesizing realistic face images provide a pathway to replace real datasets with synthetic ones, both to train and to benchmark face recognition (FR) systems. This paper presents a study on benchmarking FR systems using a synthetic dataset. First, we introduce a methodology to generate a synthetic dataset without human intervention by exploiting the latent structure of a StyleGAN2 model with multiple controlled factors of variation. Then, we confirm that (i) the generated synthetic identities are not data subjects from the GAN’s training dataset, which is verified on a synthetic dataset with 10K+ identities; and (ii) benchmarking results on the synthetic dataset are a good substitute, often providing error rates and system rankings similar to those obtained by benchmarking on the real dataset.
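As a rough illustration of what "exploiting the latent structure with controlled factors of variation" can look like, the sketch below moves a latent code along one direction per factor. It is a minimal sketch under stated assumptions, not the procedure from the paper: the 512-dimensional latent size, the factor names `pose` and `illumination`, the random stand-in directions, and all function names are hypothetical, and no generator is actually invoked.

```python
import math
import random

LATENT_DIM = 512  # assumption: the latent size commonly used with StyleGAN2


def unit(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def sample_latent(rng, dim=LATENT_DIM):
    """Draw a random latent code standing in for one synthetic identity."""
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


def edit_latent(w, direction, alpha):
    """Move latent code w by a distance alpha along the given direction."""
    n = unit(direction)
    return [wi + alpha * ni for wi, ni in zip(w, n)]


def make_variations(w, directions, strengths):
    """Return {(factor, alpha): edited latent} for every factor and strength."""
    return {
        (name, alpha): edit_latent(w, d, alpha)
        for name, d in directions.items()
        for alpha in strengths
    }


rng = random.Random(0)
identity = sample_latent(rng)
# Hypothetical directions: in practice they would be estimated from the
# latent space (for example with a linear classifier), not drawn at random.
directions = {"pose": sample_latent(rng), "illumination": sample_latent(rng)}
variations = make_variations(identity, directions, strengths=[-2.0, 0.0, 2.0])
```

Each entry of `variations` would then be passed to the generator to render one image of the same synthetic identity under a different pose or illumination setting; whether identity is actually preserved along a direction has to be checked empirically.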

Tiago de Freitas Pereira | Sébastien Marcel | Laurent Colbois
