Wasserstein CNN: Learning Invariant Features for NIR-VIS Face Recognition

Heterogeneous face recognition (HFR) aims to match facial images acquired from different sensing modalities, with mission-critical applications in forensics, security and commercial sectors. However, HFR is more challenging than traditional face recognition because of the large intra-class variation among heterogeneous face images and the limited availability of cross-modality training pairs. This paper proposes a novel Wasserstein convolutional neural network (WCNN) approach for learning invariant features between near-infrared (NIR) and visual (VIS) face images (i.e., NIR-VIS face recognition). The low-level layers of the WCNN are trained with widely available face images in the VIS spectrum, and the high-level layer is divided into three parts: the NIR layer, the VIS layer and the NIR-VIS shared layer. The first two layers learn modality-specific features, and the NIR-VIS shared layer is designed to learn a modality-invariant feature subspace. The Wasserstein distance is introduced into the NIR-VIS shared layer to measure the dissimilarity between heterogeneous feature distributions. WCNN learning minimizes the Wasserstein distance between the NIR distribution and the VIS distribution to obtain invariant deep feature representations of heterogeneous face images. To avoid over-fitting on small-scale heterogeneous face data, a correlation prior is introduced on the fully-connected WCNN layers to reduce the size of the parameter space. This prior is implemented by a low-rank constraint in an end-to-end network. The joint formulation leads to an alternating minimization for deep feature representation at the training stage and an efficient computation for heterogeneous data at the testing stage. Extensive experiments on three challenging NIR-VIS face recognition databases demonstrate the superiority of the WCNN method over state-of-the-art methods.
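
To make the distribution-matching idea concrete, the following is a minimal sketch of how a Wasserstein distance between NIR and VIS shared-layer features might be evaluated on a batch. It assumes, purely for illustration, that each modality's features are modelled as a Gaussian with diagonal covariance, in which case the squared 2-Wasserstein distance has the closed form ||m_nir - m_vis||^2 + ||s_nir - s_vis||^2 (means m, per-dimension standard deviations s). The abstract does not state the exact form used, and the function and variable names here are hypothetical.

```python
from statistics import fmean, pstdev


def diag_gaussian_w2_squared(nir_feats, vis_feats):
    """Squared 2-Wasserstein distance between diagonal Gaussians
    fitted to two batches of feature vectors (lists of equal-length rows).

    Assumption (not from the abstract): each modality's shared-layer
    features are summarised by a per-dimension mean and standard deviation.
    """
    dims = len(nir_feats[0])
    total = 0.0
    for d in range(dims):
        nir_col = [row[d] for row in nir_feats]
        vis_col = [row[d] for row in vis_feats]
        # Gap between the per-dimension means of the two modalities.
        mean_gap = fmean(nir_col) - fmean(vis_col)
        # Gap between the per-dimension (population) standard deviations.
        std_gap = pstdev(nir_col) - pstdev(vis_col)
        total += mean_gap ** 2 + std_gap ** 2
    return total


# Toy batches: the VIS features equal the NIR features shifted by 1 per dimension.
nir_batch = [[0.0, 0.0], [2.0, 2.0]]
vis_batch = [[1.0, 1.0], [3.0, 3.0]]
distance = diag_gaussian_w2_squared(nir_batch, vis_batch)
```

In a training loop, a quantity of this kind would be added to the recognition loss as a penalty, so that minimizing it pulls the two feature distributions together. A differentiable framework would be needed in practice, and this pure-Python version only illustrates the arithmetic.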
