A Feature Map is Worth a Video Frame: Rethinking Convolutional Features for Visible-Infrared Person Re-identification

Visible-Infrared Person Re-identification (VI-ReID) aims to match the identity of the same person across different spectra. In VI-ReID, the feature maps obtained from the convolutional layers are generally used for loss calculation in the later stages of the model, but their role in the early and middle stages remains unexplored. In this paper, we propose a novel Rethinking Convolutional Features (ReCF) approach for VI-ReID. ReCF consists of two modules: Middle Feature Generation (MFG), which uses the early-stage feature maps to reduce the significant modality gap, and Temporal Feature Aggregation (TFA), which uses the middle-stage feature maps to aggregate multi-level features and enlarge the receptive field. MFG generates middle-modality features with a learnable convolution layer that acts as a bridge between the RGB and IR modalities; this is more flexible than using fixed-parameter grayscale images and yields a better middle modality that further reduces the modality gap. TFA treats the convolution process as a video sequence, in which the feature map of each convolution layer can be regarded as a video frame. On this basis, we obtain a multi-level receptive field and a temporal refinement. In addition, we introduce a color-unrelated loss and a modality-unrelated loss that constrain the modality features to share a common feature representation space. Experimental results on challenging VI-ReID datasets demonstrate that the proposed method achieves state-of-the-art performance.
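
The two ideas behind MFG and TFA can be illustrated with a minimal, dependency-free Python sketch. It is not the implementation used in ReCF: the abstract does not specify layer shapes, pooling, or aggregation operators, so the function names (`pointwise_conv`, `global_avg_pool`, `temporal_aggregate`), the 1x1 kernel, and the plain averaging over layers are all assumptions made for illustration. The sketch shows that a fixed grayscale conversion is one special case of a learnable channel-mixing convolution, and that feature maps of different resolutions can be pooled and then aggregated along a layer ("time") axis as if they were video frames.

```python
# Illustrative sketch only; names and operators are assumptions, not ReCF's code.

GRAY_WEIGHTS = (0.299, 0.587, 0.114)  # fixed ITU-R BT.601 luma coefficients


def pointwise_conv(image, weights, bias=0.0):
    """Apply a 1x1 convolution that maps 3 input channels to 1 output channel.

    image:   H x W nested list of (r, g, b) tuples.
    weights: three per-channel weights; these would be learned in a real model.
    """
    return [[sum(w * c for w, c in zip(weights, pixel)) + bias
             for pixel in row]
            for row in image]


def global_avg_pool(feature_map):
    """Reduce an H x W feature map to a single scalar descriptor."""
    values = [v for row in feature_map for v in row]
    return sum(values) / len(values)


def temporal_aggregate(frames):
    """Treat per-layer feature maps as video frames.

    Each frame is pooled on its own (so frames may differ in resolution),
    then the pooled values are averaged over the layer ("time") axis.
    """
    pooled = [global_avg_pool(frame) for frame in frames]
    return sum(pooled) / len(pooled)


# A tiny 2 x 2 RGB image.
image = [[(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
         [(0.0, 0.0, 1.0), (1.0, 1.0, 1.0)]]

# Fixed-parameter grayscale: the weights never change.
gray = pointwise_conv(image, GRAY_WEIGHTS)

# Learnable middle modality: same operation, but the weights are free parameters.
# The values below are hypothetical stand-ins for learned weights.
learned = pointwise_conv(image, (0.5, 0.25, 0.25))

# Two "frames" of different spatial size, standing in for two layers' outputs.
aggregated = temporal_aggregate([gray, [[2.0]]])
```

In a real network the channel-mixing weights would be updated by backpropagation together with the rest of the model, and the simple average over layers would likely be replaced by a learned temporal operator; both choices here are placeholders.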
