Deep feature learning with relative distance comparison for person re-identification

Identifying the same individual across different scenes is an important yet difficult task in intelligent video surveillance. Its main difficulty lies in how to preserve similarity of the same person against large appearance and structure variation while discriminating different individuals. In this paper, we present a scalable distance driven feature learning framework based on the deep neural network for person re-identification, and demonstrate its effectiveness to handle the existing challenges. Specifically, given the training images with the class labels (person IDs), we first produce a large number of triplet units, each of which contains three images, i.e. one person with a matched reference and a mismatched reference. Treating the units as the input, we build the convolutional neural network to generate the layered representations, and follow with the L 2 distance metric. By means of parameter optimization, our framework tends to maximize the relative distance between the matched pair and the mismatched pair for each triplet unit. Moreover, a nontrivial issue arising with the framework is that the triplet organization cubically enlarges the number of training triplets, as one image can be involved into several triplet units. To overcome this problem, we develop an effective triplet generation scheme and an optimized gradient descent algorithm, making the computational load mainly depend on the number of original images instead of the number of triplets. On several challenging databases, our approach achieves very promising results and outperforms other state-of-the-art approaches. HighlightsWe present a novel feature learning framework for person re-identification.Our framework is based on the maximum relative distance comparison.The learning algorithm is scalable to process large amount of data.We demonstrate superior performances over other state-of-the-arts.

[1]  Jake Porway,et al.  A stochastic graph grammar for compositional object representation and recognition , 2009, Pattern Recognit..

[2]  Liang Lin,et al.  Human Re-identification by Matching Compositional Template with Cluster Sampling , 2013, 2013 IEEE International Conference on Computer Vision.

[3]  Ruimao Zhang,et al.  Adaptive Scene Category Discovery With Generative Learning and Compositional Sampling , 2015, IEEE Transactions on Circuits and Systems for Video Technology.

[4]  Dacheng Tao,et al.  Person Re-Identification Over Camera Networks Using Multi-Task Distance Metric Learning , 2014, IEEE Transactions on Image Processing.

[5]  Liang Lin,et al.  Layered Graph Matching with Composite Cluster Sampling , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[6]  Xiaogang Wang,et al.  DeepReID: Deep Filter Pairing Neural Network for Person Re-identification , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[7]  Xiaogang Wang,et al.  Human Reidentification with Transferred Metric Learning , 2012, ACCV.

[8]  Sergio A. Velastin,et al.  Local Fisher Discriminant Analysis for Pedestrian Re-identification , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[9]  Xiaogang Wang,et al.  Unsupervised Salience Learning for Person Re-identification , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[10]  Alessandro Perina,et al.  Person re-identification by symmetry-driven accumulation of local features , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[11]  Xiaogang Wang,et al.  Shape and Appearance Context Modeling , 2007, 2007 IEEE 11th International Conference on Computer Vision.

[12]  Zhen Li,et al.  Learning Locally-Adaptive Decision Functions for Person Verification , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[13]  Bingpeng Ma,et al.  BiCov: a novel image representation for person re-identification and face verification , 2012, BMVC.

[14]  Xiaogang Wang,et al.  Person Re-identification by Salience Matching , 2013, 2013 IEEE International Conference on Computer Vision.

[15]  Liang Lin,et al.  Representing and recognizing objects with massive local image patches , 2012, Pattern Recognit..

[16]  Yang Song,et al.  Learning Fine-Grained Image Similarity with Deep Ranking , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[17]  Xiaogang Wang,et al.  Learning Mid-level Filters for Person Re-identification , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[18]  B. Ripley,et al.  Pattern Recognition , 1968, Nature.

[19]  Feiping Nie,et al.  Learning a Mahalanobis distance metric for data clustering and classification , 2008, Pattern Recognit..

[20]  Hai Tao,et al.  Viewpoint Invariant Pedestrian Recognition with an Ensemble of Localized Features , 2008, ECCV.

[21]  Larry S. Davis,et al.  Learning Discriminative Appearance-Based Models Using Partial Least Squares , 2009, 2009 XXII Brazilian Symposium on Computer Graphics and Image Processing.

[22]  Horst Bischof,et al.  Large scale metric learning from equivalence constraints , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[23]  Stan Z. Li,et al.  Deep Metric Learning for Practical Person Re-Identification , 2014, ArXiv.

[24]  Shaogang Gong,et al.  Towards Person Identification and Re-identification with Attributes , 2012, ECCV Workshops.

[25]  Michael I. Jordan,et al.  Distance Metric Learning with Application to Clustering with Side-Information , 2002, NIPS.

[26]  Shaogang Gong,et al.  Person re-identification by probabilistic relative distance comparison , 2011, CVPR 2011.

[27]  Shaogang Gong,et al.  Person Re-identification by Video Ranking , 2014, ECCV.

[28]  Jian-Huang Lai,et al.  Discriminatively Trained And-Or Graph Models for Object Shape Detection , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[29]  Chunxiao Liu,et al.  Person Re-identification: What Features Are Important? , 2012, ECCV Workshops.

[30]  Inderjit S. Dhillon,et al.  Information-theoretic metric learning , 2006, ICML '07.

[31]  Kilian Q. Weinberger,et al.  Distance Metric Learning for Large Margin Nearest Neighbor Classification , 2005, NIPS.

[32]  Hai Tao,et al.  Evaluating Appearance Models for Recognition, Reacquisition, and Tracking , 2007 .

[33]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[34]  Trevor Darrell,et al.  Caffe: Convolutional Architecture for Fast Feature Embedding , 2014, ACM Multimedia.

[35]  Qingming Huang,et al.  Set-label modeling and deep metric learning on person re-identification , 2015, Neurocomputing.

[36]  Amir Globerson,et al.  Metric Learning by Collapsing Classes , 2005, NIPS.

[37]  Frédéric Jurie,et al.  PCCA: A new approach for distance learning from sparse pairwise constraints , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.