RGB-IR Person Re-identification by Cross-Modality Similarity Preservation

Person re-identification (Re-ID) is an important problem in video surveillance for matching pedestrian images across non-overlapping camera views. Currently, most works focus on RGB-based Re-ID. However, RGB images are not well suited to a dark environment; consequently, infrared (IR) imaging becomes necessary for indoor scenes with low lighting and 24-h outdoor scene surveillance systems. In such scenarios, matching needs to be performed between RGB images and IR images, which exhibit different visual characteristics; this cross-modality matching problem is more challenging than RGB-based Re-ID due to the lack of visible colour information in IR images. To address this challenge, we study the RGB-IR cross-modality Re-ID (RGB-IR Re-ID) problem. Rather than applying existing cross-modality matching models that operate under the assumption of identical data distributions between training and testing sets to handle the discrepancy between RGB and IR modalities for Re-ID, we cast learning shared knowledge for cross-modality matching as the problem of cross-modality similarity preservation. We exploit same-modality similarity as the constraint to guide the learning of cross-modality similarity along with the alleviation of modality-specific information, and finally propose a Focal Modality-Aware Similarity-Preserving Loss. To further assist the feature extractor in extracting shared knowledge, we design a modality-gated node as a universal representation of both modality-specific and shared structures for constructing a structure-learnable feature extractor called Modality-Gated Extractor. For validation, we construct a new multi-modality Re-ID dataset, called SYSU-MM01, to enable wider study of this problem. Extensive experiments on this SYSU-MM01 dataset show the effectiveness of our method. Download link of dataset: https://github.com/wuancong/SYSU-MM01 .

[1]  Yang Li,et al.  Viewpoint Invariant Human Re-Identification in Camera Networks Using Pose Priors and Subject-Discriminative Features , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[2]  Longhui Wei,et al.  Person Transfer GAN to Bridge Domain Gap for Person Re-identification , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[3]  Qi Tian,et al.  MARS: A Video Benchmark for Large-Scale Person Re-Identification , 2016, ECCV.

[4]  Michael Arens,et al.  Local Feature Based Person Reidentification in Infrared Image Sequences , 2010, 2010 7th IEEE International Conference on Advanced Video and Signal Based Surveillance.

[5]  Shaogang Gong,et al.  Learning a Discriminative Null Space for Person Re-identification , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[6]  Dongqing Zhang,et al.  Large-Scale Supervised Multimodal Hashing with Semantic Correlation Maximization , 2014, AAAI.

[7]  Wei-Shi Zheng,et al.  Unsupervised Person Re-Identification by Deep Asymmetric Metric Embedding , 2019, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[8]  Ling Shao,et al.  Fast Person Re-identification via Cross-Camera Semantic Binary Transformation , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[9]  Trevor Darrell,et al.  Adversarial Discriminative Domain Adaptation , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[10]  Michael Jones,et al.  An improved deep learning architecture for person re-identification , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[11]  Michael I. Jordan,et al.  Learning Transferable Features with Deep Adaptation Networks , 2015, ICML.

[12]  Jeff A. Bilmes,et al.  Deep Canonical Correlation Analysis , 2013, ICML.

[13]  Ehud Rivlin,et al.  Color Invariants for Person Reidentification , 2013, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[14]  Bernhard Schölkopf,et al.  A Kernel Two-Sample Test , 2012, J. Mach. Learn. Res..

[15]  Shiliang Zhang,et al.  Deep Attributes Driven Multi-Camera Person Re-identification , 2016, ECCV.

[16]  Jian-Huang Lai,et al.  Mirror Representation for Modeling View-Specific Transform in Person Re-Identification , 2015, IJCAI.

[17]  Geoffrey E. Hinton,et al.  Visualizing Data using t-SNE , 2008 .

[18]  Nanning Zheng,et al.  Similarity learning on an explicit polynomial kernel feature map for person re-identification , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[19]  Xiang Li,et al.  Top-Push Video-Based Person Re-identification , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[20]  Xiaogang Wang,et al.  DeepReID: Deep Filter Pairing Neural Network for Person Re-identification , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[21]  Wei-Shi Zheng,et al.  Fine-Grained Person Re-identification , 2020, International Journal of Computer Vision.

[22]  Xiaogang Wang,et al.  Human Reidentification with Transferred Metric Learning , 2012, ACCV.

[23]  Wei-Shi Zheng,et al.  Fast Open-World Person Re-Identification , 2018, IEEE Transactions on Image Processing.

[24]  Xiao-Yuan Jing,et al.  Super-resolution Person re-identification with semi-coupled low-rank discriminant dictionary learning , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[25]  Xiaogang Wang,et al.  Joint Detection and Identification Feature Learning for Person Search , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[26]  Yi Yang,et al.  Unlabeled Samples Generated by GAN Improve the Person Re-identification Baseline in Vitro , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[27]  Qi Tian,et al.  Scalable Person Re-identification: A Benchmark , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[28]  Xiaogang Wang,et al.  Learning Deep Feature Representations with Domain Guided Dropout for Person Re-identification , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[29]  Li Fei-Fei,et al.  Recurrent Attention Models for Depth-Based Person Identification , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[30]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[31]  Shaogang Gong,et al.  Associating Groups of People , 2009, BMVC.

[32]  Pong C. Yuen,et al.  Semi-supervised Region Metric Learning for Person Re-identification , 2018, International Journal of Computer Vision.

[33]  Horst Bischof,et al.  Person Re-identification by Descriptive and Discriminative Classification , 2011, SCIA.

[34]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[35]  Pong C. Yuen,et al.  Hierarchical Discriminative Learning for Visible Thermal Person Re-Identification , 2018, AAAI.

[36]  Jian-Huang Lai,et al.  RGB-Infrared Cross-Modality Person Re-identification , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[37]  Francesco Solera,et al.  Performance Measures and a Data Set for Multi-target, Multi-camera Tracking , 2016, ECCV Workshops.

[38]  Shaogang Gong,et al.  Towards Open-World Person Re-Identification by One-Shot Group-Based Verification , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[39]  Jingdong Wang,et al.  Deeply-Learned Part-Aligned Representations for Person Re-identification , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[40]  Rongrong Ji,et al.  Cross-Modality Person Re-Identification with Generative Adversarial Training , 2018, IJCAI.

[41]  Shiguang Shan,et al.  Multi-view Deep Network for Cross-View Classification , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[42]  Qi Tian,et al.  Beyond Part Models: Person Retrieval with Refined Part Pooling , 2017, ECCV.

[43]  Dumitru Erhan,et al.  Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[44]  Ruifan Li,et al.  Cross-modal Retrieval with Correspondence Autoencoder , 2014, ACM Multimedia.

[45]  Tao Xiang,et al.  Multi-Scale Learning for Low-Resolution Person Re-Identification , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[46]  David W. Jacobs,et al.  Generalized Multiview Analysis: A discriminative latent space , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[47]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[48]  Jian-Huang Lai,et al.  Multi-shot Person Re-identification with Automatic Ambiguity Inference and Removal , 2014, 2014 22nd International Conference on Pattern Recognition.

[49]  Yizhou Wang,et al.  Quantized Correlation Hashing for Fast Cross-Modal Search , 2015, IJCAI.

[50]  Anton van den Hengel,et al.  Learning to rank in person re-identification with metric ensembles , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[51]  Jian-Huang Lai,et al.  Person Re-Identification by Camera Correlation Aware Feature Augmentation , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[52]  Vittorio Murino,et al.  Custom Pictorial Structures for Re-identification , 2011, BMVC.

[53]  Shengcai Liao,et al.  Person re-identification by Local Maximal Occurrence representation and metric learning , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[54]  Takahiro Okabe,et al.  Hierarchical Gaussian Descriptor for Person Re-identification , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[55]  Lei Zhang,et al.  Cross-Domain Visual Matching via Generalized Similarity Measure and Feature Learning , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[56]  Ling Shao,et al.  Cross-Modality Submodular Dictionary Learning for Information Retrieval , 2014, CIKM.

[57]  Stan Z. Li,et al.  Coupled Spectral Regression for matching heterogeneous faces , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[58]  Wei-Shi Zheng,et al.  Person Re-Identification by Contour Sketch Under Moderate Clothing Change , 2019, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[59]  Tieniu Tan,et al.  Wasserstein CNN: Learning Invariant Features for NIR-VIS Face Recognition , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[60]  Fei Xiong,et al.  Person Re-Identification Using Kernel-Based Metric Learning Methods , 2014, ECCV.

[61]  Beng Chin Ooi,et al.  Effective deep learning-based multi-modal retrieval , 2015, The VLDB Journal.

[62]  Roger Levy,et al.  A new approach to cross-modal multimedia retrieval , 2010, ACM Multimedia.

[63]  Liang Zheng,et al.  Re-ranking Person Re-identification with k-Reciprocal Encoding , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[64]  Hai Tao,et al.  Evaluating Appearance Models for Recognition, Reacquisition, and Tracking , 2007 .

[65]  Enhua Wu,et al.  Squeeze-and-Excitation Networks , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[66]  Xiaogang Wang,et al.  Person Re-Identification by Saliency Learning , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[67]  Jian-Huang Lai,et al.  An Asymmetric Distance Model for Cross-View Feature Mapping in Person Reidentification , 2017, IEEE Transactions on Circuits and Systems for Video Technology.

[68]  Shaogang Gong,et al.  Person Re-Identification by Support Vector Ranking , 2010, BMVC.

[69]  Alberto Del Bimbo,et al.  Person Re-Identification by Iterative Re-Weighted Sparse Ranking , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[70]  Chunxiao Liu,et al.  Person Re-identification: What Features Are Important? , 2012, ECCV Workshops.

[71]  Tien Dat Nguyen,et al.  Person Recognition System Based on a Combination of Body Images from Visible Light and Thermal Cameras , 2017, Sensors.

[72]  Shaogang Gong,et al.  Person Re-Identification by Unsupervised \ell _1 ℓ 1 Graph Learning , 2016, ECCV.

[73]  Dahua Lin,et al.  Inter-modality Face Recognition , 2006, ECCV.

[74]  Xiang Li,et al.  Cross-Scenario Transfer Person Reidentification , 2016, IEEE Transactions on Circuits and Systems for Video Technology.

[75]  Antonio Torralba,et al.  Learning Aligned Cross-Modal Representations from Weakly Aligned Data , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[76]  Jianmin Wang,et al.  Semantics-preserving hashing for cross-view retrieval , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[77]  Nicu Sebe,et al.  Group Consistent Similarity Learning via Deep CRF for Person Re-identification , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[78]  Shaogang Gong,et al.  Reidentification by Relative Distance Comparison , 2013, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[79]  S. Gong,et al.  Person Re-Identification by Unsupervised \ell _1 ℓ 1 Graph Learning , 2016, ECCV.

[80]  Wei-Shi Zheng,et al.  Cross-View Asymmetric Metric Learning for Unsupervised Person Re-Identification , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[81]  Jian-Huang Lai,et al.  Matching NIR Face to VIS Face Using Transduction , 2014, IEEE Transactions on Information Forensics and Security.

[82]  Horst Bischof,et al.  Large scale metric learning from equivalence constraints , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[83]  Zhen Li,et al.  Learning Locally-Adaptive Decision Functions for Person Verification , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[84]  Kate Saenko,et al.  Deep CORAL: Correlation Alignment for Deep Domain Adaptation , 2016, ECCV Workshops.

[85]  Tieniu Tan,et al.  Learning Invariant Deep Representation for NIR-VIS Face Recognition , 2017, AAAI.

[86]  Tao Xiang,et al.  Transferring a semantic representation for person re-identification and search , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[87]  Yang Li,et al.  Person Re-Identification with Discriminatively Trained Viewpoint Invariant Dictionaries , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[88]  Slawomir Bak,et al.  One-Shot Metric Learning for Person Re-identification , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[89]  Yao Zhao,et al.  Cross-Modal Retrieval With CNN Visual Features: A New Baseline , 2017, IEEE Transactions on Cybernetics.

[90]  Zheng Wang,et al.  Visible Thermal Person Re-Identification via Dual-Constrained Top-Ranking , 2018, IJCAI.

[91]  Sergio A. Velastin,et al.  Local Fisher Discriminant Analysis for Pedestrian Re-identification , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[92]  Alessandro Perina,et al.  Person re-identification by symmetry-driven accumulation of local features , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[93]  James Philbin,et al.  FaceNet: A unified embedding for face recognition and clustering , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[94]  Hai Tao,et al.  Viewpoint Invariant Pedestrian Recognition with an Ensemble of Localized Features , 2008, ECCV.

[95]  Liang Wang,et al.  Mask-Guided Contrastive Attention Model for Person Re-identification , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[96]  Shaogang Gong,et al.  Unsupervised Person Re-identification by Deep Learning Tracklet Association , 2018, ECCV.

[97]  Jian-Huang Lai,et al.  Robust Depth-Based Person Re-Identification , 2017, IEEE Transactions on Image Processing.

[98]  Shengcai Liao,et al.  Salient Color Names for Person Re-identification , 2014, ECCV.

[99]  Xiang Li,et al.  An enhanced deep feature representation for person re-identification , 2016, 2016 IEEE Winter Conference on Applications of Computer Vision (WACV).

[100]  Shengcai Liao,et al.  Efficient PSD Constrained Asymmetric Metric Learning for Person Re-Identification , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[101]  Bill Triggs,et al.  Histograms of oriented gradients for human detection , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[102]  Xiang Li,et al.  Partial Person Re-Identification , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).