Combining multilevel feature extraction and multi-loss learning for person re-identification

Abstract: The goal of person re-identification (re-id) is to match images of the same person captured by multiple cameras with non-overlapping views. The task is challenging because of the large spatial displacement and pose changes of a person across different views. Recently, deep Convolutional Neural Networks (CNNs) have significantly improved the performance of person re-id. In this paper, we present a hybrid deep model that combines multilevel feature extraction and multi-loss learning to obtain more robust pedestrian descriptors. The multi-loss function jointly optimizes a verification task, which determines whether two images belong to the same person, and a recognition task, which predicts the identity of each image. Specifically, given two person images, we first apply a deep network, called the Feature Aggregation Network (FAN), to extract their multilevel CNN features by fusing information from different layers. For the verification task, a Recurrent Comparative Network (RCN) is presented to learn a joint representation of the paired CNN features. RCN determines whether two images depict the same person by focusing on discriminative regions and alternately comparing their appearance. It is an algorithmic imitation of the human decision-making process, in which a person repeatedly compares two objects before making a decision about their similarity. For the recognition task, a parameter-free operation termed Global Average Pooling (GAP) is applied to each CNN feature to extract identity-related features. Extensive experiments are conducted on four datasets, CUHK03, CUHK01, Market1501 and DukeMTMC, and the results demonstrate the effectiveness of the presented method.
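The two ingredients named in the abstract, GAP over multilevel features and a joint verification-plus-recognition loss, can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the concatenation used to fuse the pooled features, the use of softmax cross-entropy for both tasks, and the `weight` parameter are all assumptions made for illustration, and the RCN itself is not modelled (its output is stood in for by a two-way logit vector).

```python
import numpy as np

def global_average_pool(feature_map):
    """Parameter-free GAP: average a (C, H, W) feature map over H and W -> (C,)."""
    return feature_map.mean(axis=(1, 2))

def multilevel_descriptor(feature_maps):
    """Pool each level's feature map and concatenate the results.
    Concatenation is an assumed fusion rule, chosen only for illustration."""
    return np.concatenate([global_average_pool(f) for f in feature_maps])

def softmax_cross_entropy(logits, label):
    """Cross-entropy of one example, computed stably from unnormalised logits."""
    shifted = logits - logits.max()
    log_probs = shifted - np.log(np.exp(shifted).sum())
    return -log_probs[label]

def multi_loss(id_logits_a, id_logits_b, id_a, id_b, verif_logits, weight=1.0):
    """Joint loss: identity classification for both images plus a binary
    same/different verification term. `weight` is a hypothetical balance factor."""
    same = int(id_a == id_b)  # 1 if the pair shows the same person, else 0
    recognition = (softmax_cross_entropy(id_logits_a, id_a)
                   + softmax_cross_entropy(id_logits_b, id_b))
    verification = softmax_cross_entropy(verif_logits, same)
    return recognition + weight * verification

# Usage: two feature levels with 8 and 16 channels give a 24-dimensional descriptor.
rng = np.random.default_rng(0)
levels = [rng.normal(size=(8, 12, 6)), rng.normal(size=(16, 6, 3))]
descriptor = multilevel_descriptor(levels)
loss = multi_loss(rng.normal(size=5), rng.normal(size=5), 2, 2, rng.normal(size=2))
```

Because GAP has no trainable parameters, the descriptor size depends only on the channel counts of the chosen layers, not on the spatial resolution of the input.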
