Part-Aligned Bilinear Representations for Person Re-identification

We propose a novel network that learns a part-aligned representation for person re-identification. It handles the body part misalignment problem, that is, body parts are misaligned across human detections due to pose/viewpoint change and unreliable detection. Our model consists of a two-stream network (one stream for appearance map extraction and the other one for body part map extraction) and a bilinear-pooling layer that generates and spatially pools a part-aligned map. Each local feature of the part-aligned map is obtained by a bilinear mapping of the corresponding local appearance and body part descriptors. Our new representation leads to a robust image matching similarity, which is equivalent to an aggregation of the local similarities of the corresponding body parts combined with the weighted appearance similarity. This part-aligned representation reduces the part misalignment problem significantly. Our approach is also advantageous over other pose-guided representations (e.g., extracting representations over the bounding box of each body part) by learning part descriptors optimal for person re-identification. For training the network, our approach does not require any part annotation on the person re-identification dataset. Instead, we simply initialize the part sub-stream using a pre-trained sub-network of an existing pose estimation network, and train the whole network to minimize the re-identification loss. We validate the effectiveness of our approach by demonstrating its superiority over the state-of-the-art methods on the standard benchmark datasets, including Market-1501, CUHK03, CUHK01 and DukeMTMC, and standard video dataset MARS.

[1]  Shiliang Zhang,et al.  Pose-Driven Deep Convolutional Model for Person Re-identification , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[2]  Huchuan Lu,et al.  Deep Mutual Learning , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[3]  Anton van den Hengel,et al.  Learning to rank in person re-identification with metric ensembles , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[4]  Nanning Zheng,et al.  Person Re-identification by Multi-Channel Parts-Based CNN with Improved Triplet Loss Function , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[5]  Yi Yang,et al.  Person Re-identification: Past, Present and Future , 2016, ArXiv.

[6]  Gang Wang,et al.  A Siamese Long Short-Term Memory Architecture for Human Re-identification , 2016, ECCV.

[7]  Lucas Beyer,et al.  In Defense of the Triplet Loss for Person Re-Identification , 2017, ArXiv.

[8]  Huchuan Lu,et al.  Sample-Specific SVM Learning for Person Re-identification , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[9]  Antonio Torralba,et al.  SIFT Flow: Dense Correspondence across Scenes and Its Applications , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[10]  Liang Lin,et al.  Human Re-identification by Matching Compositional Template with Cluster Sampling , 2013, 2013 IEEE International Conference on Computer Vision.

[11]  Shaogang Gong,et al.  Person Re-identification by Deep Learning Multi-scale Representations , 2017, 2017 IEEE International Conference on Computer Vision Workshops (ICCVW).

[12]  David Zhang,et al.  Joint Learning of Single-Image and Cross-Image Representations for Person Re-identification , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[13]  Huchuan Lu,et al.  Pose-Invariant Embedding for Deep Person Re-Identification , 2017, IEEE Transactions on Image Processing.

[14]  Xiang Li,et al.  Partial Person Re-Identification , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[15]  Tao Xiang,et al.  Deep Transfer Learning for Person Re-Identification , 2016, 2018 IEEE Fourth International Conference on Multimedia Big Data (BigMM).

[16]  Jianxin Wu,et al.  Person Re-Identification with Correspondence Structure Learning , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[17]  Alessandro Perina,et al.  Person re-identification by symmetry-driven accumulation of local features , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[18]  Nanning Zheng,et al.  Similarity Learning with Spatial Constraints for Person Re-identification , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[19]  Liang Zheng,et al.  Re-ranking Person Re-identification with k-Reciprocal Encoding , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[20]  Shaogang Gong,et al.  Person Re-Identification by Unsupervised \ell _1 ℓ 1 Graph Learning , 2016, ECCV.

[21]  Zhongfei Zhang,et al.  Semantics-Aware Deep Correspondence Structure Learning for Robust Person Re-Identification , 2016, IJCAI.

[22]  Yaser Sheikh,et al.  OpenPose: Realtime Multi-Person 2D Pose Estimation Using Part Affinity Fields , 2018, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[23]  Subhransu Maji,et al.  Bilinear CNN Models for Fine-Grained Visual Recognition , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[24]  Yi Yang,et al.  Unlabeled Samples Generated by GAN Improve the Person Re-identification Baseline in Vitro , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[25]  Qi Tian,et al.  Query-adaptive late fusion for image search and person re-identification , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[26]  Hai Tao,et al.  Viewpoint Invariant Pedestrian Recognition with an Ensemble of Localized Features , 2008, ECCV.

[27]  Tao Xiang,et al.  Multi-Scale Learning for Low-Resolution Person Re-Identification , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[28]  Yang Gao,et al.  Compact Bilinear Pooling , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[29]  Larry S. Davis,et al.  Multi-Task Learning with Low Rank Attribute Embedding for Person Re-Identification , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[30]  Yifan Sun,et al.  SVDNet for Pedestrian Retrieval , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[31]  Qi Tian,et al.  Scalable Person Re-identification: A Benchmark , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[32]  Takahiro Okabe,et al.  Hierarchical Gaussian Descriptor for Person Re-identification , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[33]  Shengcai Liao,et al.  Efficient PSD Constrained Asymmetric Metric Learning for Person Re-Identification , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[34]  Xiaogang Wang,et al.  Joint Detection and Identification Feature Learning for Person Search , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[35]  Bingpeng Ma,et al.  Local Descriptors Encoded by Fisher Vectors for Person Re-identification , 2012, ECCV Workshops.

[36]  Kaiqi Huang,et al.  Learning Deep Context-Aware Features over Body and Latent Parts for Person Re-identification , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[37]  Yi Yang,et al.  A Discriminatively Learned CNN Embedding for Person Reidentification , 2016, ACM Trans. Multim. Comput. Commun. Appl..

[38]  Shuicheng Yan,et al.  Video-Based Person Re-Identification With Accumulative Motion Context , 2017, IEEE Transactions on Circuits and Systems for Video Technology.

[39]  Shengcai Liao,et al.  Deep Metric Learning for Person Re-identification , 2014, 2014 22nd International Conference on Pattern Recognition.

[40]  Pietro Perona,et al.  Microsoft COCO: Common Objects in Context , 2014, ECCV.

[41]  Xiaogang Wang,et al.  DeepReID: Deep Filter Pairing Neural Network for Person Re-identification , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[42]  Xiaogang Wang,et al.  Human Reidentification with Transferred Metric Learning , 2012, ACCV.

[43]  Amit K. Roy-Chowdhury,et al.  Temporal Model Adaptation for Person Re-identification , 2016, ECCV.

[44]  Jingdong Wang,et al.  Deeply-Learned Part-Aligned Representations for Person Re-identification , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[45]  Qi Tian,et al.  MARS: A Video Benchmark for Large-Scale Person Re-Identification , 2016, ECCV.

[46]  Shaogang Gong,et al.  Learning a Discriminative Null Space for Person Re-identification , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[47]  Yoshua Bengio,et al.  Understanding the difficulty of training deep feedforward neural networks , 2010, AISTATS.

[48]  Liang Zheng,et al.  Improving Person Re-identification by Attribute and Identity Learning , 2017, Pattern Recognit..

[49]  Horst-Michael Groß,et al.  Appearance-Based 3D Upper-Body Pose Estimation and Person Re-identification on Mobile Robots , 2013, 2013 IEEE International Conference on Systems, Man, and Cybernetics.

[50]  Xiaogang Wang,et al.  Learning Deep Feature Representations with Domain Guided Dropout for Person Re-identification , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[51]  Alfredo Gardel Vicente,et al.  Person Re-Identification Ranking Optimisation by Discriminant Context Information Analysis , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[52]  Varun Ramakrishna,et al.  Convolutional Pose Machines , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[53]  Xiaogang Wang,et al.  Spindle Net: Person Re-identification with Human Body Region Guided Feature Decomposition and Fusion , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[54]  Geoffrey E. Hinton,et al.  Visualizing Data using t-SNE , 2008 .

[55]  Rainer Stiefelhagen,et al.  Person Re-identification by Deep Learning Attribute-Complementary Information , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[56]  Nanning Zheng,et al.  Similarity learning on an explicit polynomial kernel feature map for person re-identification , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[57]  Jia Deng,et al.  Stacked Hourglass Networks for Human Pose Estimation , 2016, ECCV.

[58]  Rasmus Pagh,et al.  Fast and scalable polynomial kernels via explicit feature maps , 2013, KDD.

[59]  Gang Wang,et al.  Gated Siamese Convolutional Neural Network Architecture for Human Re-identification , 2016, ECCV.

[60]  Marco Cristani,et al.  Person Re-identification by Articulated Appearance Matching , 2014, Person Re-Identification.

[61]  Jesús Martínez del Rincón,et al.  Recurrent Convolutional Network for Video-Based Person Re-identification , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[62]  Lin Wu,et al.  PersonNet: Person Re-identification with Deep Convolutional Neural Networks , 2016, ArXiv.

[63]  Bernt Schiele,et al.  Multiple People Tracking by Lifted Multicut and Person Re-identification , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[64]  Francesco Solera,et al.  Performance Measures and a Data Set for Multi-target, Multi-camera Tracking , 2016, ECCV Workshops.

[65]  Yi Yang,et al.  Random Erasing Data Augmentation , 2017, AAAI.

[66]  Baowen Xu,et al.  Super-resolution Person re-identification with semi-coupled low-rank discriminant dictionary learning , 2015, CVPR.

[67]  Michael S. Bernstein,et al.  ImageNet Large Scale Visual Recognition Challenge , 2014, International Journal of Computer Vision.

[68]  Victor S. Lempitsky,et al.  Multi-Region bilinear convolutional neural networks for person re-identification , 2015, 2017 14th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS).

[69]  Jian-Huang Lai,et al.  Person Re-Identification by Camera Correlation Aware Feature Augmentation , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[70]  Vittorio Murino,et al.  Custom Pictorial Structures for Re-identification , 2011, BMVC.

[71]  Michael Jones,et al.  An improved deep learning architecture for person re-identification , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[72]  Zhen Zhou,et al.  See the Forest for the Trees: Joint Spatial and Temporal Recurrent Neural Networks for Video-Based Person Re-identification , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[73]  Qi Tian,et al.  Scalable Person Re-identification on Supervised Smoothed Manifold , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[74]  Xiaogang Wang,et al.  Learning Mid-level Filters for Person Re-identification , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[75]  Shiliang Zhang,et al.  Deep Attributes Driven Multi-Camera Person Re-identification , 2016, ECCV.

[76]  Slawomir Bak,et al.  Person Re-identification Using Spatial Covariance Regions of Human Body Parts , 2010, 2010 7th IEEE International Conference on Advanced Video and Signal Based Surveillance.

[77]  Jian-Huang Lai,et al.  Deep Ranking for Person Re-Identification via Joint Representation Learning , 2015, IEEE Transactions on Image Processing.

[78]  Tao Xiang,et al.  Transferring a semantic representation for person re-identification and search , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[79]  Xiaogang Wang,et al.  HydraPlus-Net: Attentive Deep Features for Pedestrian Analysis , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[80]  Shaogang Gong,et al.  Human-in-the-Loop Person Re-identification , 2016, ECCV.

[81]  Yu Cheng,et al.  Jointly Attentive Spatial-Temporal Pooling Networks for Video-Based Person Re-identification , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[82]  Dumitru Erhan,et al.  Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[83]  Yu Liu,et al.  Quality Aware Network for Set to Set Recognition , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[84]  S. Gong,et al.  Person Re-Identification by Unsupervised \ell _1 ℓ 1 Graph Learning , 2016, ECCV.

[85]  Shengcai Liao,et al.  Person re-identification by Local Maximal Occurrence representation and metric learning , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[86]  Xiaogang Wang,et al.  End-to-End Deep Learning for Person Search , 2016, ArXiv.

[87]  Shaogang Gong,et al.  Unsupervised Cross-Dataset Transfer Learning for Person Re-identification , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).