Norm-Aware Embedding for Efficient Person Search and Tracking

Person Search is a practically relevant task that aims to jointly solve Person Detection and Person Re-identification (re-ID). Specifically, it requires to find and locate all instances with the same identity as the query person in a set of panoramic gallery images. One major challenge comes from the contradictory goals of the two sub-tasks, i.e., person detection focuses on finding the commonness of all persons while person re-ID handles the differences among multiple identities. Therefore, it is crucial to reconcile the relationship between the two sub-tasks in a joint person search model. To this end, We present a novel approach called Norm-Aware Embedding to disentangle the person embedding into norm and angle for detection and re-ID respectively, allowing for both effective and efficient multi-task training. We further extend the proposal-level person embedding to pixel-level, whose discrimination ability is less affected by mis-alignment. We outperform other one-step methods by a large margin and achieve comparable performance to two-step methods on both CUHK-SYSU and PRW. Also, Our method is easy to train and resource-friendly, running at 12 fps on a single GPU.

[1]  Q. Tian,et al.  GLAD: Global-Local-Alignment Descriptor for Pedestrian Retrieval , 2017, ACM Multimedia.

[2]  Bingbing Ni,et al.  Learning Context Graph for Person Search , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[3]  Michael Jones,et al.  An improved deep learning architecture for person re-identification , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[4]  Jian Yang,et al.  Occluded Pedestrian Detection Through Guided Attention in CNNs , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[5]  Qi Tian,et al.  MARS: A Video Benchmark for Large-Scale Person Re-Identification , 2016, ECCV.

[6]  Bernt Schiele,et al.  Towards Reaching Human Performance in Pedestrian Detection , 2018, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[7]  Shaogang Gong,et al.  Learning a Discriminative Null Space for Person Re-identification , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[8]  Kaiming He,et al.  Feature Pyramid Networks for Object Detection , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[9]  Wei Jiang,et al.  SphereReID: Deep Hypersphere Manifold Embedding for Person Re-Identification , 2018, J. Vis. Commun. Image Represent..

[10]  Shuicheng Yan,et al.  End-to-End Comparative Attention Networks for Person Re-Identification , 2016, IEEE Transactions on Image Processing.

[11]  Xiaogang Wang,et al.  Learning Deep Feature Representations with Domain Guided Dropout for Person Re-identification , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[12]  Jitendra Malik,et al.  Deformable part models are convolutional neural networks , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[13]  Shengcai Liao,et al.  Deep Metric Learning for Person Re-identification , 2014, 2014 22nd International Conference on Pattern Recognition.

[14]  Changxin Gao,et al.  Re-ID Driven Localization Refinement for Person Search , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[15]  Shengcai Liao,et al.  Person re-identification by Local Maximal Occurrence representation and metric learning , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[16]  Qi Tian,et al.  Person Re-identification in the Wild , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[17]  Bingpeng Ma,et al.  Person Search in a Scene by Jointly Modeling People Commonness and Person Uniqueness , 2014, ACM Multimedia.

[18]  B. Schiele,et al.  How Far are We from Solving Pedestrian Detection? , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[19]  Pietro Perona,et al.  Fast Feature Pyramids for Object Detection , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[20]  LinLiang,et al.  Deep feature learning with relative distance comparison for person re-identification , 2015 .

[21]  Xiaogang Wang,et al.  Unsupervised Salience Learning for Person Re-identification , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[22]  Bernt Schiele,et al.  CityPersons: A Diverse Dataset for Pedestrian Detection , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[23]  Xiaogang Wang,et al.  Shape and Appearance Context Modeling , 2007, 2007 IEEE 11th International Conference on Computer Vision.

[24]  Armin B. Cremers,et al.  Informed Haar-Like Features Improve Pedestrian Detection , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[25]  Pietro Perona,et al.  Integral Channel Features , 2009, BMVC.

[26]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[27]  Yunchao Wei,et al.  IAN: The Individual Aggregation Network for Person Search , 2017, Pattern Recognit..

[28]  Qi Tian,et al.  Scalable Person Re-identification: A Benchmark , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[29]  Bhiksha Raj,et al.  SphereFace: Deep Hypersphere Embedding for Face Recognition , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[30]  Sergey Ioffe,et al.  Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift , 2015, ICML.

[31]  Meng Yang,et al.  Large-Margin Softmax Loss for Convolutional Neural Networks , 2016, ICML.

[32]  Alessandro Perina,et al.  Person re-identification by symmetry-driven accumulation of local features , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[33]  Xiaogang Wang,et al.  Joint Detection and Identification Feature Learning for Person Search , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[34]  Kaiming He,et al.  Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[35]  Stefanos Zafeiriou,et al.  ArcFace: Additive Angular Margin Loss for Deep Face Recognition , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[36]  Luca Antiga,et al.  Automatic differentiation in PyTorch , 2017 .

[37]  David A. McAllester,et al.  Object Detection with Discriminatively Trained Part Based Models , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[38]  Jian Yang,et al.  Person Search via A Mask-Guided Two-Stream CNN Model , 2018, ECCV.

[39]  Yu Qiao,et al.  A Discriminative Feature Learning Approach for Deep Face Recognition , 2016, ECCV.

[40]  Tao Xiang,et al.  Multi-Scale Learning for Low-Resolution Person Re-Identification , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[41]  Hantao Yao,et al.  Deep Representation Learning With Part Loss for Person Re-Identification , 2017, IEEE Transactions on Image Processing.

[42]  Nanning Zheng,et al.  Person Re-identification by Multi-Channel Parts-Based CNN with Improved Triplet Loss Function , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[43]  Xiaogang Wang,et al.  Joint Deep Learning for Pedestrian Detection , 2013, 2013 IEEE International Conference on Computer Vision.

[44]  Yi Yang,et al.  RCAA: Relational Context-Aware Agents for Person Search , 2018, ECCV.

[45]  Gang Wang,et al.  A Siamese Long Short-Term Memory Architecture for Human Re-identification , 2016, ECCV.

[46]  Jing Xu,et al.  Attention-Aware Compositional Network for Person Re-identification , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[47]  Li Fei-Fei,et al.  ImageNet: A large-scale hierarchical image database , 2009, CVPR.

[48]  Liang Lin,et al.  Deep feature learning with relative distance comparison for person re-identification , 2015, Pattern Recognit..

[49]  Horst Bischof,et al.  Large scale metric learning from equivalence constraints , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[50]  Lei Zhang,et al.  One-shot Face Recognition by Promoting Underrepresented Classes , 2017, ArXiv.

[51]  Xiaogang Wang,et al.  A discriminative deep model for pedestrian detection with occlusion handling , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[52]  Bernt Schiele,et al.  Filtered channel features for pedestrian detection , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[53]  Xu Lan,et al.  Person Search by Multi-Scale Matching , 2018, ECCV.

[54]  Lei Zhang,et al.  Homocentric Hypersphere Feature Embedding for Person Re-Identification , 2019, 2019 IEEE International Conference on Image Processing (ICIP).

[55]  Bo Zhao,et al.  Neural Person Search Machines , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[56]  Jingdong Wang,et al.  Deeply-Learned Part-Aligned Representations for Person Re-identification , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[57]  Qi Tian,et al.  Beyond Part Models: Person Retrieval with Refined Part Pooling , 2017, ECCV.

[58]  Trevor Darrell,et al.  Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation , 2013, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[59]  Hao Wang,et al.  Orthogonal Deep Features Decomposition for Age-Invariant Face Recognition , 2018, ECCV.

[60]  Federico Tombari,et al.  Query-Guided End-To-End Person Search , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).