Deeply Associative Two-Stage Representations Learning Based on Labels Interval Extension Loss and Group Loss for Person Re-Identification

Person Re-identification (ReID) aims to match people across non-overlapping camera views in a public space, which is usually regarded as an image retrieval problem to match query images with pedestrian images in the gallery. It is challenging since many difficulties exist such as pose misalignments, occlusions, similar appearance when detecting people. Existing researches on ReID mainly focus on two major problems: representation learning and metric learning. In this paper, we target at learning discriminative representations and make two contributions in total. ( $i$ ) We propose a novel architecture named Deeply Associative Two-stage Representations Learning (DATRL). It contains the global re-initialization stage and fully-perceptual classification stage employing two identical CNNs associatively at the same time. On the global stage, we take on the backbone of one deep CNN e.g., dozens of layers in the front of Resnet-50 as a normal re-initialization subnetwork. Meanwhile, we apply our own proposed 3D-transpose technique into the backbone of the other CNN to form the 3D-transpose re-initialization subnetwork. The fully-perceptual stage is actually made up of the leftover layers of the original CNNs. On this stage, we take both the global representations learned at multiple hierarchies and the local representations uniformly-partitioned on the highest conv-layer into consideration, and then optimizing them separately for classification. ( $ii$ ) We introduce a new joint loss function in which our proposed Labels Interval Extension loss (LIEL) and Group loss (GL) are combined to enhance the performance of gradient decent as well as increasing the distances between image features with different identities. We apply the above DATRL, LIEL and GL to ReID thus obtaining DATRL-ReID. Experimental results on four datasets CUHK03, Market-1501, DukeMTMC-reID and MSMT17-V2 demonstrate that DATRL-ReID shows excellent performance in improving recognition accuracy and is superior to state-of-the-art methods.

[1]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[2]  Yu Liu,et al.  Quality Aware Network for Set to Set Recognition , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[3]  Jingdong Wang,et al.  Deeply-Learned Part-Aligned Representations for Person Re-identification , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[4]  Qi Tian,et al.  Beyond Part Models: Person Retrieval with Refined Part Pooling , 2017, ECCV.

[5]  Toby P. Breckon,et al.  Using Deep Convolutional Neural Network Architectures for Object Classification and Detection Within X-Ray Baggage Security Imagery , 2018, IEEE Transactions on Information Forensics and Security.

[6]  Cheng Wang,et al.  Mancs: A Multi-task Attentional Network with Curriculum Sampling for Person Re-Identification , 2018, ECCV.

[7]  Tao Xiang,et al.  Multi-level Factorisation Net for Person Re-identification , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[8]  Ajay Kumar,et al.  A CNN-Based Framework for Comparison of Contactless to Contact-Based Fingerprints , 2019, IEEE Transactions on Information Forensics and Security.

[9]  Lucas Beyer,et al.  In Defense of the Triplet Loss for Person Re-Identification , 2017, ArXiv.

[10]  Shiguang Shan,et al.  Image to Video Person Re-Identification by Learning Heterogeneous Dictionary Pair With Feature Projection Matrix , 2018, IEEE Transactions on Information Forensics and Security.

[11]  Bingpeng Ma,et al.  BiCov: a novel image representation for person re-identification and face verification , 2012, BMVC.

[12]  Xiaogang Wang,et al.  Spindle Net: Person Re-identification with Human Body Region Guided Feature Decomposition and Fusion , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[13]  Dumitru Erhan,et al.  Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[14]  Xiaogang Wang,et al.  HydraPlus-Net: Attentive Deep Features for Pedestrian Analysis , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[15]  Kilian Q. Weinberger,et al.  Densely Connected Convolutional Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[16]  Shaogang Gong,et al.  Person Re-identification by Deep Learning Multi-scale Representations , 2017, 2017 IEEE International Conference on Computer Vision Workshops (ICCVW).

[17]  Xiaogang Wang,et al.  Deep Learning Face Representation by Joint Identification-Verification , 2014, NIPS.

[18]  Huchuan Lu,et al.  Pose-Invariant Embedding for Deep Person Re-Identification , 2017, IEEE Transactions on Image Processing.

[19]  Jia Deng,et al.  Stacked Hourglass Networks for Human Pose Estimation , 2016, ECCV.

[20]  Yifan Sun,et al.  SVDNet for Pedestrian Retrieval , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[21]  Yi Yang,et al.  Person Re-identification: Past, Present and Future , 2016, ArXiv.

[22]  Andrew Zisserman,et al.  Spatial Transformer Networks , 2015, NIPS.

[23]  Longhui Wei,et al.  Person Transfer GAN to Bridge Domain Gap for Person Re-identification , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[24]  Qi Tian,et al.  Scalable Person Re-identification: A Benchmark , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[25]  Shengcai Liao,et al.  Person re-identification by Local Maximal Occurrence representation and metric learning , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[26]  Muhittin Gokmen,et al.  Human Semantic Parsing for Person Re-identification , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[27]  Anil K. Jain,et al.  Unconstrained Face Recognition: Identifying a Person of Interest From a Media Collection , 2014, IEEE Transactions on Information Forensics and Security.

[28]  Kaiqi Huang,et al.  Learning Deep Context-Aware Features over Body and Latent Parts for Person Re-identification , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[29]  Hantao Yao,et al.  Deep Representation Learning With Part Loss for Person Re-Identification , 2017, IEEE Transactions on Image Processing.

[30]  Shiliang Zhang,et al.  Pose-Driven Deep Convolutional Model for Person Re-identification , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[31]  Qi Tian,et al.  Scalable Person Re-identification on Supervised Smoothed Manifold , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[32]  Yann LeCun,et al.  Dimensionality Reduction by Learning an Invariant Mapping , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[33]  Zhaoxiang Zhang,et al.  DarkRank: Accelerating Deep Metric Learning via Cross Sample Similarities Transfer , 2017, AAAI.

[34]  Shaogang Gong,et al.  Learning a Discriminative Null Space for Person Re-identification , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[35]  Shaogang Gong,et al.  Person Re-Identification by Deep Joint Learning of Multi-Loss Classification , 2017, IJCAI.

[36]  Liang Zheng,et al.  Improving Person Re-identification by Attribute and Identity Learning , 2017, Pattern Recognit..

[37]  Tao Mei,et al.  Part-Aligned Bilinear Representations for Person Re-identification , 2018, ECCV.

[38]  Rainer Stiefelhagen,et al.  Person Re-identification by Deep Learning Attribute-Complementary Information , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[39]  Yan Wang,et al.  Resource Aware Person Re-identification Across Multiple Resolutions , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[40]  Shaogang Gong,et al.  Harmonious Attention Network for Person Re-identification , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[41]  Yi Yang,et al.  Pedestrian Alignment Network for Large-scale Person Re-Identification , 2017, IEEE Transactions on Circuits and Systems for Video Technology.

[42]  Liang Zheng,et al.  Re-ranking Person Re-identification with k-Reciprocal Encoding , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[43]  Yu Qiao,et al.  A Discriminative Feature Learning Approach for Deep Face Recognition , 2016, ECCV.

[44]  Bernt Schiele,et al.  DeeperCut: A Deeper, Stronger, and Faster Multi-person Pose Estimation Model , 2016, ECCV.

[45]  Tao Xiang,et al.  Multi-scale Deep Learning Architectures for Person Re-identification , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[46]  Q. Tian,et al.  GLAD: Global-Local-Alignment Descriptor for Pedestrian Retrieval , 2017, ACM Multimedia.

[47]  Lin Wu,et al.  PersonNet: Person Re-identification with Deep Convolutional Neural Networks , 2016, ArXiv.

[48]  Jinjun Wang,et al.  Deep ranking model by large adaptive margin learning for person re-identification , 2018, Pattern Recognit..

[49]  Shuicheng Yan,et al.  End-to-End Comparative Attention Networks for Person Re-Identification , 2016, IEEE Transactions on Image Processing.

[50]  Yaser Sheikh,et al.  OpenPose: Realtime Multi-Person 2D Pose Estimation Using Part Affinity Fields , 2018, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[51]  Yi Yang,et al.  Unlabeled Samples Generated by GAN Improve the Person Re-identification Baseline in Vitro , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[52]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[53]  Belhassen Bayar,et al.  Constrained Convolutional Neural Networks: A New Approach Towards General Purpose Image Manipulation Detection , 2018, IEEE Transactions on Information Forensics and Security.

[54]  Leon A. Gatys,et al.  Texture Synthesis Using Convolutional Neural Networks , 2015, NIPS.

[55]  Gang Wang,et al.  A Siamese Long Short-Term Memory Architecture for Human Re-identification , 2016, ECCV.

[56]  Gang Wang,et al.  Gated Siamese Convolutional Neural Network Architecture for Human Re-identification , 2016, ECCV.

[57]  Nanning Zheng,et al.  Person Re-identification by Multi-Channel Parts-Based CNN with Improved Triplet Loss Function , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[58]  Xiaogang Wang,et al.  DeepReID: Deep Filter Pairing Neural Network for Person Re-identification , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[59]  Andrea Vedaldi,et al.  Understanding deep image representations by inverting them , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[60]  Yi Yang,et al.  Random Erasing Data Augmentation , 2017, AAAI.

[61]  Michael Jones,et al.  An improved deep learning architecture for person re-identification , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[62]  David Zhang,et al.  Joint Learning of Single-Image and Cross-Image Representations for Person Re-identification , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[63]  Yoshua Bengio,et al.  Show, Attend and Tell: Neural Image Caption Generation with Visual Attention , 2015, ICML.

[64]  Jian Sun,et al.  AlignedReID: Surpassing Human-Level Performance in Person Re-Identification , 2017, ArXiv.

[65]  Ross B. Girshick,et al.  Mask R-CNN , 2017, 1703.06870.

[66]  Shengcai Liao,et al.  Deep Metric Learning for Person Re-identification , 2014, 2014 22nd International Conference on Pattern Recognition.

[67]  Patrizio Campisi,et al.  Convolutional Neural Network for Finger-Vein-Based Biometric Identification , 2019, IEEE Transactions on Information Forensics and Security.