MSO: Multi-Feature Space Joint Optimization Network for RGB-Infrared Person Re-Identification

The RGB-infrared cross-modality person re-identification (ReID) task aims to match images of the same identity across the visible and infrared modalities. Existing methods mainly use a two-stream architecture to eliminate the discrepancy between the two modalities in the final common feature space, which ignores the single-modality space of each modality in the shallow layers. To address this, we present a novel multi-feature space joint optimization (MSO) network, which learns modality-sharable features in both the single-modality space and the common space. First, based on the observation that edge information is modality-invariant, we propose an edge features enhancement module to enhance the modality-sharable features in each single-modality space. Specifically, we design a perceptual edge features (PEF) loss after analyzing edge fusion strategies. To the best of our knowledge, this is the first work to propose explicit optimization in the single-modality feature space for the cross-modality ReID task. Moreover, to increase the difference between the cross-modality distance and the class distance, we introduce a novel cross-modality contrastive-center (CMCC) loss into the modality-joint constraints in the common feature space. The PEF loss and CMCC loss jointly optimize the model in an end-to-end manner, which markedly improves the network's performance. Extensive experiments demonstrate that the proposed model significantly outperforms state-of-the-art methods on both the SYSU-MM01 and RegDB datasets.
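
The observation that edge information is modality-invariant can be illustrated with a toy example. The sketch below is not the MSO edge features enhancement module: the abstract does not name an edge operator, so the Sobel filter, the function name `sobel_magnitude`, and the 4x4 test image are all assumptions made for illustration. Intensity inversion is used only as a crude stand-in for an appearance change between modalities; it is not a model of infrared imaging.

```python
import math

# Standard 3x3 Sobel kernels (an assumed choice of edge operator).
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_magnitude(img):
    """Return the Sobel gradient magnitude for the interior pixels of a 2D list."""
    height, width = len(img), len(img[0])
    out = []
    for r in range(1, height - 1):
        row = []
        for c in range(1, width - 1):
            # Accumulate the horizontal and vertical responses over the 3x3 window.
            gx = sum(SOBEL_X[i][j] * img[r + i - 1][c + j - 1]
                     for i in range(3) for j in range(3))
            gy = sum(SOBEL_Y[i][j] * img[r + i - 1][c + j - 1]
                     for i in range(3) for j in range(3))
            row.append(math.hypot(gx, gy))
        out.append(row)
    return out


# Hypothetical toy image: dark left half, bright right half (one vertical edge).
visible = [[0, 0, 255, 255] for _ in range(4)]
# Inverted intensities: the appearance changes, the edge location does not.
inverted = [[255 - v for v in row] for row in visible]

edges_visible = sobel_magnitude(visible)
edges_inverted = sobel_magnitude(inverted)
print(edges_visible == edges_inverted)
```

The two edge maps coincide even though every pixel value differs, which is the intuition behind enhancing edge features in each single-modality space before the features reach the common space.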
