Learning Discriminative Part Features Through Attentions For Effective And Scalable Person Search

This paper proposes a new method for person search, the task of locating a specific person, given as a query image, within a gallery of scene images. Current state-of-the-art techniques in person search demonstrate impressive performance, but they are limited in efficiency and scalability because they require multiple models and/or must re-process the gallery images for every query. We argue that a concise framework with a single neural network can achieve both scalability and performance. In our framework, the network detects people and extracts their appearance features, so that person search reduces to finding the person closest to the query in the feature space. For performance, we focus on the quality of the person appearance features: our network is designed and trained to produce features that are discriminative, fine-grained, adaptive to appearance variations, and robust against person localization errors. To this end, we design channel attention and part-wise spatial attention modules as well as a loss for learning discriminative features. Despite its concise single-network pipeline, our framework outperforms the current state of the art on the PRW benchmark.
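
The retrieval step described above, finding the person closest to the query in the feature space, can be illustrated with a minimal sketch. It assumes that features for all detected gallery persons have already been extracted once and stored, which is what lets a new query be answered without re-processing the gallery. The names `search_person` and `gallery_index`, the toy feature vectors, and the choice of cosine similarity as the closeness measure are assumptions made for illustration; the abstract does not specify the metric or the data layout.

```python
import math


def l2_normalize(vec):
    """Scale a feature vector to unit length (returned unchanged if all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def cosine_similarity(a, b):
    """Cosine similarity between two feature vectors of equal length."""
    a_n, b_n = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a_n, b_n))


def search_person(query_feature, gallery_index, top_k=5):
    """Rank precomputed gallery detections by similarity to the query.

    gallery_index is a hypothetical list of (image_id, box, feature) tuples,
    one per detected person, built once and reused for every query.
    """
    scored = [
        (cosine_similarity(query_feature, feature), image_id, box)
        for image_id, box, feature in gallery_index
    ]
    # Highest similarity first; only the score is used as the sort key.
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:top_k]


# Toy gallery: two scene images, three detected persons, 3-D features.
gallery_index = [
    ("scene_001", (10, 20, 50, 120), [0.9, 0.1, 0.0]),
    ("scene_001", (80, 25, 40, 110), [0.0, 1.0, 0.0]),
    ("scene_002", (30, 15, 45, 115), [0.8, 0.2, 0.1]),
]
query_feature = [1.0, 0.0, 0.0]

results = search_person(query_feature, gallery_index, top_k=2)
for score, image_id, box in results:
    print(f"{image_id} box={box} similarity={score:.3f}")
```

Because the gallery features are computed independently of the query, the cost of each additional query in this sketch is a single pass over stored vectors rather than another run of the network over the scene images.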
