Pose Agnostic Cross-spectral Hallucination via Disentangling Independent Factors

The cross-sensor gap is one of the challenges that arise much research interests in Heterogeneous Face Recognition (HFR). Although recent methods have attempted to fill the gap with deep generative networks, most of them suffered from the inevitable misalignment between different face modalities. Instead of imaging sensors, the misalignment primarily results from geometric variations (e.g., pose and expression) on faces that stay independent from spectrum. Rather than building a monolithic but complex structure, this paper proposes a Pose Agnostic Cross-spectral Hallucination (PACH) approach to disentangle the independent factors and deal with them in individual stages. In the first stage, an Unsupervised Face Alignment (UFA) network is designed to align the near-infrared (NIR) and visible (VIS) images in a generative way, where 3D information is effectively utilized as the pose guidance. Thus the task of the second stage becomes spectrum transform with paired data. We develop a Texture Prior Synthesis (TPS) network to accomplish complexion control and consequently generate more realistic VIS images than existing methods. Experiments on three challenging NIR-VIS datasets verify the effectiveness of our approach in producing visually appealing images and achieving state-of-the-art performance in cross-spectral HFR.

[1]  D. Jacobs,et al.  Bypassing synthesis: PLS for face recognition with pose, low-resolution and sketch , 2011, CVPR 2011.

[2]  Tieniu Tan,et al.  Transferring deep representation for NIR-VIS heterogeneous face recognition , 2016, 2016 International Conference on Biometrics (ICB).

[3]  拓海 杉山,et al.  “Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks”の学習報告 , 2017 .

[4]  Yoshua Bengio,et al.  Generative Adversarial Nets , 2014, NIPS.

[5]  Xiaogang Wang,et al.  Face sketch synthesis and recognition , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[6]  Ran He,et al.  Pose-preserving Cross Spectral Face Hallucination , 2019, IJCAI.

[7]  Matti Pietikäinen,et al.  Learning mappings for face synthesis from near infrared to visual light images , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[8]  Shiguang Shan,et al.  Multi-View Discriminant Analysis , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[9]  Tieniu Tan,et al.  A Light CNN for Deep Face Representation With Noisy Labels , 2015, IEEE Transactions on Information Forensics and Security.

[10]  Man Zhang,et al.  Adversarial Discriminative Heterogeneous Face Recognition , 2017, AAAI.

[11]  Yuxiao Hu,et al.  MS-Celeb-1M: A Dataset and Benchmark for Large-Scale Face Recognition , 2016, ECCV.

[12]  Li Fei-Fei,et al.  Perceptual Losses for Real-Time Style Transfer and Super-Resolution , 2016, ECCV.

[13]  Timo Aila,et al.  A Style-Based Generator Architecture for Generative Adversarial Networks , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[14]  Xuelong Li,et al.  Heterogeneous Face Recognition: A Common Encoding Feature Discriminant Approach , 2017, IEEE Transactions on Image Processing.

[15]  Jan Kautz,et al.  Multimodal Unsupervised Image-to-Image Translation , 2018, ECCV.

[16]  Ran He,et al.  Dual Variational Generation for Low-Shot Heterogeneous Face Recognition , 2019, NeurIPS.

[17]  Stan Z. Li,et al.  Heterogeneous face image matching using multi-scale features , 2012, 2012 5th IAPR International Conference on Biometrics (ICB).

[18]  Qian Zhang,et al.  High Fidelity Face Manipulation with Extreme Pose and Expression , 2019, ArXiv.

[19]  Stefanos Zafeiriou,et al.  Optimal UV spaces for facial morphable model construction , 2014, 2014 IEEE International Conference on Image Processing (ICIP).

[20]  Alexei A. Efros,et al.  Image-to-Image Translation with Conditional Adversarial Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[21]  Andrew Zisserman,et al.  Deep Face Recognition , 2015, BMVC.

[22]  Marios Savvides,et al.  NIR-VIS heterogeneous face recognition via cross-spectral joint dictionary learning and reconstruction , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[23]  Hamed Kiani Galoogahi,et al.  Face sketch recognition by Local Radon Binary Pattern: LRBP , 2012, 2012 19th IEEE International Conference on Image Processing.

[24]  Guillermo Sapiro,et al.  Not Afraid of the Dark: NIR-VIS Face Recognition via Cross-Spectral Hallucination and Low-Rank Embedding , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[25]  Soumith Chintala,et al.  Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks , 2015, ICLR.

[26]  Dong Yi,et al.  Face Matching Between Near Infrared and Visible Light Images , 2007, ICB.

[27]  Xiao Wang,et al.  Regularized Discriminative Spectral Regression Method for Heterogeneous Face Matching , 2013, IEEE Transactions on Image Processing.

[28]  Shengcai Liao,et al.  Heterogeneous Face Recognition from Local Structures of Normalized Appearance , 2009, ICB.

[29]  Shengcai Liao,et al.  The CASIA NIR-VIS 2.0 Face Database , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition Workshops.

[30]  Serge J. Belongie,et al.  Arbitrary Style Transfer in Real-Time with Adaptive Instance Normalization , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[31]  Zhenan Sun,et al.  Pose-Guided Photorealistic Face Rotation , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[32]  Tieniu Tan,et al.  Learning Invariant Deep Representation for NIR-VIS Face Recognition , 2017, AAAI.

[33]  Zhenan Sun,et al.  Disentangled Variational Representation for Heterogeneous Face Recognition , 2018, AAAI.