Deep hard modality alignment for visible thermal person re-identification

Abstract: Visible Thermal Person Re-Identification (VTReID) is essentially a cross-modality problem that is widely encountered in real night-time surveillance scenarios, and its performance still needs substantial improvement. In this work, we design a simple but effective Hard Modality Alignment Network (HMAN) framework to learn modality-robust features. Because current VTReID works do not consider the imbalance of cross-modality discrepancies, their models are likely to suffer from selective alignment behavior. To solve this problem, we propose a novel Hard Modality Alignment (HMA) loss that simultaneously balances and reduces the modality discrepancies. Specifically, we mine the hard feature subspace with large modality discrepancies and abandon the easy feature subspace with small modality discrepancies to make the modality distributions more distinguishable. To mitigate the discrepancy imbalance, we pay more attention to reducing the modality discrepancies of the hard feature subspace than to those of the easy feature subspace. Furthermore, we propose to jointly relieve the modality heterogeneity of global and local visual semantics to further boost cross-modality retrieval performance. Experiments demonstrate the effectiveness of the proposed method, which achieves superior performance over state-of-the-art methods on the RegDB and SYSU-MM01 datasets.
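The idea of mining a hard feature subspace can be illustrated with a minimal sketch. This is not the HMA loss as defined in the paper, whose exact formulation the abstract does not give. It assumes, purely for illustration, that the modality discrepancy of each feature dimension is the squared gap between the visible and thermal batch means, and that the hard subspace is the fraction of dimensions with the largest gap. The function name `hard_subspace_discrepancy` and the parameter `hard_ratio` are hypothetical.

```python
import numpy as np

def hard_subspace_discrepancy(visible_feats, thermal_feats, hard_ratio=0.5):
    """Illustrative stand-in for a hard-subspace alignment penalty.

    Assumption (not from the paper): the discrepancy of a feature dimension
    is the squared difference between the visible and thermal batch means.
    Only the `hard_ratio` fraction of dimensions with the largest discrepancy
    contributes to the penalty; the remaining (easy) dimensions are ignored.
    """
    visible_feats = np.asarray(visible_feats, dtype=float)
    thermal_feats = np.asarray(thermal_feats, dtype=float)

    # Per-dimension gap between the two modality means.
    mean_gap = visible_feats.mean(axis=0) - thermal_feats.mean(axis=0)
    per_dim = mean_gap ** 2

    # Keep the dimensions with the largest discrepancy (the "hard" subspace).
    num_hard = max(1, int(round(hard_ratio * per_dim.size)))
    hard_idx = np.argsort(per_dim)[-num_hard:]

    # Average discrepancy over the hard subspace only.
    return per_dim[hard_idx].mean(), np.sort(hard_idx)


# Toy batch: two visible and two thermal feature vectors of dimension 4.
visible = [[1.0, 0.0, 0.0, 0.0], [3.0, 0.0, 0.0, 0.0]]
thermal = [[0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
penalty, hard_dims = hard_subspace_discrepancy(visible, thermal, hard_ratio=0.5)
```

In this sketch, restricting the average to the hard dimensions keeps already-aligned dimensions from diluting the penalty, which is one simple way to focus alignment on the hard subspace. Any weighting between hard and easy subspaces in the actual method may differ.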
