Attend to the Difference: Cross-Modality Person Re-identification via Contrastive Correlation

Cross-modality person re-identification has received increasing attention recently because of its practical significance. Motivated by the fact that humans usually attend to the differences when they compare two similar objects, we propose a dual-path cross-modality feature learning framework that preserves intrinsic spatial structures and attends to the differences between input cross-modality image pairs. Our framework is composed of two main components: a Dual-path Spatial-structure-preserving Common Space Network (DSCSN) and a Contrastive Correlation Network (CCN). The former embeds cross-modality images into a common 3D tensor space without losing spatial structure, while the latter extracts contrastive features by dynamically comparing input image pairs. Note that the representations generated for the input RGB and infrared images are therefore mutually dependent. We conduct extensive experiments on two publicly available RGB-IR ReID datasets, SYSU-MM01 and RegDB, and our proposed method outperforms state-of-the-art algorithms by a large margin under both the full and simplified evaluation modes.
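
The idea of extracting contrastive features by comparing an image pair can be illustrated with a small sketch. This is not the authors' CCN: it assumes, purely for illustration, that each C x H x W feature map is summarized into a 1x1 kernel by global average pooling, that the contrastive kernel is the difference of the two summaries, and that this kernel is then correlated with both feature maps. The function names (`global_avg_pool`, `correlate_1x1`, `contrastive_features`) are hypothetical.

```python
# Illustrative sketch only; the kernel generator and shapes are assumptions,
# not the architecture described in the paper.

def global_avg_pool(feat):
    """Collapse a C x H x W feature map (nested lists) to one value per channel."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in feat]


def correlate_1x1(feat, kernel):
    """Apply a 1x1 kernel: a weighted sum over channels at every spatial location."""
    height, width = len(feat[0]), len(feat[0][0])
    return [
        [sum(k * ch[i][j] for k, ch in zip(kernel, feat)) for j in range(width)]
        for i in range(height)
    ]


def contrastive_features(feat_rgb, feat_ir):
    """Return one H x W response map per modality.

    The kernel is built from the difference of the two inputs, so each
    output depends on both images of the pair.
    """
    k_rgb = global_avg_pool(feat_rgb)
    k_ir = global_avg_pool(feat_ir)
    diff = [a - b for a, b in zip(k_rgb, k_ir)]  # assumed contrastive kernel
    return correlate_1x1(feat_rgb, diff), correlate_1x1(feat_ir, diff)


# Two toy 2-channel, 2x2 feature maps that differ only in the second channel.
rgb = [[[1, 2], [3, 4]], [[0, 0], [0, 0]]]
ir = [[[1, 2], [3, 4]], [[2, 2], [2, 2]]]
out_rgb, out_ir = contrastive_features(rgb, ir)
```

Because the kernel is computed from the pair, the spatial layout of each feature map is kept while channels on which the two images agree contribute nothing to the response, which is one way the representations can become mutually dependent.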
