ES-Net: Erasing Salient Parts to Learn More in Re-Identification

As an instance-level recognition problem, re-identification (re-ID) requires models to capture diverse features. However, as training continues, re-ID models pay more and more attention to salient areas. As a result, a model may focus on only a few small regions with salient representations and ignore other important information. This phenomenon leads to inferior performance, especially when models are evaluated on data with small inter-identity variation. In this paper, we propose a novel network, Erasing-Salient Net (ES-Net), that learns comprehensive features by erasing the salient areas in an image. ES-Net introduces a novel method that locates the salient areas by the confidence of objects and erases them efficiently within a training batch. Meanwhile, to mitigate the over-erasing problem, we use a trainable pooling layer, P-pooling, that generalizes global max and global average pooling. Experiments are conducted on two re-identification tasks: person re-ID and vehicle re-ID. ES-Net outperforms state-of-the-art methods on three person re-ID benchmarks and two vehicle re-ID benchmarks. Specifically, it achieves mAP / Rank-1 of 88.6% / 95.7% on Market1501, 78.8% / 89.2% on DukeMTMC-reID, 57.3% / 80.9% on MSMT17, and 81.9% / 97.0% on VeRi-776, and Rank-1 / Rank-5 of 83.6% / 96.9% on VehicleID (Small), 79.9% / 93.5% on VehicleID (Medium), and 76.9% / 90.7% on VehicleID (Large). Moreover, the visualized salient areas provide human-interpretable visual explanations for the ranking results.
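The two ideas in the abstract, erasing salient positions and pooling with a layer that sits between global average and global max, can be illustrated with a minimal sketch. It is not the ES-Net implementation: the abstract does not define how salient areas are located or how P-pooling is parameterized, so the threshold rule in `erase_salient` and the use of generalized-mean pooling as a stand-in for P-pooling are assumptions made here for illustration, and both function names are hypothetical. The exponent `p` is fixed in this sketch, whereas a trainable layer would learn it.

```python
import math


def erase_salient(feature_map, saliency, threshold):
    """Zero out positions whose saliency exceeds `threshold`.

    Assumption: a simple threshold rule, not the paper's
    confidence-based localization.
    """
    return [
        [0.0 if s > threshold else v for v, s in zip(f_row, s_row)]
        for f_row, s_row in zip(feature_map, saliency)
    ]


def generalized_mean_pool(feature_map, p):
    """Generalized-mean pooling over non-negative activations, p > 0.

    p = 1 is global average pooling; as p grows the result
    approaches global max pooling (p = inf returns the max).
    Used here as a stand-in for P-pooling (assumption).
    """
    values = [v for row in feature_map for v in row]
    if math.isinf(p):
        return max(values)
    mean_of_powers = sum(v ** p for v in values) / len(values)
    return mean_of_powers ** (1.0 / p)


# Toy 3x3 activation map with one dominant (salient) position.
feature_map = [
    [0.1, 0.2, 0.1],
    [0.3, 4.0, 0.2],
    [0.1, 0.2, 0.1],
]
# Assumption: the activation itself serves as the saliency score.
saliency = feature_map

avg = generalized_mean_pool(feature_map, 1)        # global average
near_max = generalized_mean_pool(feature_map, 50)  # close to global max
erased = erase_salient(feature_map, saliency, threshold=1.0)
max_after_erasing = generalized_mean_pool(erased, float("inf"))
```

After erasing, the strongest remaining activation is one of the previously ignored positions, which is the intuition behind forcing the model to look beyond the dominant region. A pooling exponent between the two extremes keeps some contribution from weaker positions without averaging the signal away.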
