Adversarial Disentanglement Spectrum Variations and Cross-Modality Attention Networks for NIR-VIS Face Recognition

The near-infrared and visual (NIR-VIS) matching task refers to face recognition between two images of different modalities, and it remains a challenging problem in machine vision. NIR-VIS Heterogeneous Face Recognition (HFR) faces two main difficulties: large intra-class differences caused by cross-modal data, and insufficient paired training samples. In this paper, an effective network, Adversarial Disentanglement spectrum variations and Cross-modality Attention Networks (ADCANs), is proposed for the NIR-VIS matching task. Three key components are introduced into the ADCANs to reduce the gap between cross-modal images: Advanced Scatter Loss (ASL), Modality-adversarial Feature Learning (MaFL) and Cross-modality Attention Block (CmAB). The proposed ASL captures between-class and within-class information of the data and embeds it into the network for more effective training; it focuses on categories with a small between-class distance and increases the distance between them. The MaFL consists of an Identity-Discriminative Feature Learning Network (IDFLN) and a Modality-Adversarial Disentanglement Network (MADN), which together enhance the identity-discriminative feature representations and disentangle spectrum variations via adversarial learning. The IDFLN, built as an end-to-end CNN, aims at learning identity-discriminative features, while the MADN, built from a discriminator $D$ and a generator $G$, focuses on removing modality-related information. Furthermore, to increase representation power and to disentangle spectrum variations effectively, a CmAB is introduced into the network, which sequentially applies spatial and channel attention modules to both the IDFLN and the MADN. Since the channel attention module focuses on ‘what’ features to suppress or emphasize, an orthogonality constraint is imposed on the two channel attention modules, which allows the MADN and the IDFLN to focus on learning modality-related features and identity-related features, respectively. In particular, the ADCANs consists of multiple CmABs to learn discriminative features and disentangle spectrum variations. Extensive experiments on three challenging HFR datasets indicate that the proposed ADCANs is effective for the NIR-VIS HFR task.
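The orthogonality constraint between the two channel attention modules can be illustrated with a minimal NumPy sketch. This is not the paper's formulation: the abstract does not give the exact form of the constraint, so the squared-cosine penalty below, the SE/CBAM-style attention function, and all names (`channel_attention`, `orthogonality_penalty`, `w1`, `w2`) are assumptions made for illustration only.

```python
import numpy as np

def channel_attention(feat, w1, w2):
    """Assumed SE/CBAM-style channel attention over a (C, H, W) feature map."""
    pooled = feat.mean(axis=(1, 2))              # global average pool -> (C,)
    hidden = np.maximum(w1 @ pooled, 0.0)        # bottleneck layer with ReLU
    return 1.0 / (1.0 + np.exp(-(w2 @ hidden)))  # sigmoid weights in (0, 1), shape (C,)

def orthogonality_penalty(a_id, a_mod, eps=1e-8):
    """Hypothetical penalty: squared cosine similarity of two attention vectors.

    It is 0 when the vectors are orthogonal and close to 1 when they are parallel.
    """
    cos = a_id @ a_mod / (np.linalg.norm(a_id) * np.linalg.norm(a_mod) + eps)
    return float(cos ** 2)
```

Adding such a term to the training loss would push the identity branch and the modality branch to emphasize different channels. Because sigmoid outputs are strictly positive, the penalty in this sketch can approach zero but never reach it exactly.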
