Gated Siamese Convolutional Neural Network Architecture for Human Re-identification

Matching pedestrians across multiple camera views, known as human re-identification, is a challenging research problem that has numerous applications in visual surveillance. With the resurgence of Convolutional Neural Networks (CNNs), several end-to-end deep Siamese CNN architectures have been proposed for human re-identification with the objective of projecting the images of similar pairs (i.e. same identity) to be closer to each other and those of dissimilar pairs to be distant from each other. However, current networks extract fixed representations for each image regardless of other images which are paired with it and the comparison with other images is done only at the final level. In this setting, the network is at risk of failing to extract finer local patterns that may be essential to distinguish positive pairs from hard negative pairs. In this paper, we propose a gating function to selectively emphasize such fine common local patterns by comparing the mid-level features across pairs of images. This produces flexible representations for the same image according to the images they are paired with. We conduct experiments on the CUHK03, Market-1501 and VIPeR datasets and demonstrate improved performance compared to a baseline Siamese CNN architecture.

[1]  Sergey Ioffe,et al.  Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift , 2015, ICML.

[2]  Andrea Vedaldi,et al.  MatConvNet: Convolutional Neural Networks for MATLAB , 2014, ACM Multimedia.

[3]  Gang Wang,et al.  Spatio-Temporal LSTM with Trust Gates for 3D Human Action Recognition , 2016, ECCV.

[4]  Bingpeng Ma,et al.  BiCov: a novel image representation for person re-identification and face verification , 2012, BMVC.

[5]  Xiaogang Wang,et al.  Person Re-identification by Salience Matching , 2013, 2013 IEEE International Conference on Computer Vision.

[6]  Fei Xiong,et al.  Person Re-Identification Using Kernel-Based Metric Learning Methods , 2014, ECCV.

[7]  Larry S. Davis,et al.  Multi-Task Learning with Low Rank Attribute Embedding for Person Re-Identification , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[8]  Michael Jones,et al.  An improved deep learning architecture for person re-identification , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[9]  Shengcai Liao,et al.  Color Models and Weighted Covariance Estimation for Person Re-identification , 2014, 2014 22nd International Conference on Pattern Recognition.

[10]  Xiaogang Wang,et al.  Learning Deep Feature Representations with Domain Guided Dropout for Person Re-identification , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[11]  Venkatesh Saligrama,et al.  A Novel Visual Word Co-occurrence Model for Person Re-identification , 2014, ECCV Workshops.

[12]  Shaogang Gong,et al.  Learning a Discriminative Null Space for Person Re-identification , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[13]  Harm de Vries,et al.  RMSProp and equilibrated adaptive learning rates for non-convex optimization. , 2015 .

[14]  Vittorio Murino,et al.  Custom Pictorial Structures for Re-identification , 2011, BMVC.

[15]  Gang Wang,et al.  Hierarchical Invariant Feature Learning with Marginalization for Person Re-Identification , 2015, ArXiv.

[16]  Ehud Rivlin,et al.  Color Invariants for Person Reidentification , 2013, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[17]  Jürgen Schmidhuber,et al.  Long Short-Term Memory , 1997, Neural Computation.

[18]  Shengcai Liao,et al.  Person re-identification by Local Maximal Occurrence representation and metric learning , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[19]  Qi Tian,et al.  Query-adaptive late fusion for image search and person re-identification , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[20]  Zhen Li,et al.  Learning Locally-Adaptive Decision Functions for Person Verification , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[21]  Matti Pietikäinen,et al.  Multiresolution Gray-Scale and Rotation Invariant Texture Classification with Local Binary Patterns , 2002, IEEE Trans. Pattern Anal. Mach. Intell..

[22]  Gang Wang,et al.  A Siamese Long Short-Term Memory Architecture for Human Re-identification , 2016, ECCV.

[23]  Jiwen Lu,et al.  Learning Invariant Color Features for Person Reidentification , 2014, IEEE Transactions on Image Processing.

[24]  Nanning Zheng,et al.  Similarity Learning with Spatial Constraints for Person Re-identification , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[25]  Jürgen Schmidhuber,et al.  Learning to Forget: Continual Prediction with LSTM , 2000, Neural Computation.

[26]  Nanning Zheng,et al.  Person Re-identification by Multi-Channel Parts-Based CNN with Improved Triplet Loss Function , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[27]  Hai Tao,et al.  Evaluating Appearance Models for Recognition, Reacquisition, and Tracking , 2007 .

[28]  Yann LeCun,et al.  Dimensionality Reduction by Learning an Invariant Mapping , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[29]  G LoweDavid,et al.  Distinctive Image Features from Scale-Invariant Keypoints , 2004 .

[30]  Shengcai Liao,et al.  Salient Color Names for Person Re-identification , 2014, ECCV.

[31]  Shengcai Liao,et al.  Efficient PSD Constrained Asymmetric Metric Learning for Person Re-Identification , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[32]  Jürgen Schmidhuber,et al.  Training Very Deep Networks , 2015, NIPS.

[33]  Inderjit S. Dhillon,et al.  Information-theoretic metric learning , 2006, ICML '07.

[34]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[35]  David Zhang,et al.  Joint Learning of Single-Image and Cross-Image Representations for Person Re-identification , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[36]  Qi Tian,et al.  Scalable Person Re-identification: A Benchmark , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[37]  Takahiro Okabe,et al.  Hierarchical Gaussian Descriptor for Person Re-identification , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[38]  Cordelia Schmid,et al.  Is that you? Metric learning approaches for face identification , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[39]  Yann LeCun,et al.  Signature Verification Using A "Siamese" Time Delay Neural Network , 1993, Int. J. Pattern Recognit. Artif. Intell..

[40]  Kilian Q. Weinberger,et al.  Distance Metric Learning for Large Margin Nearest Neighbor Classification , 2005, NIPS.

[41]  Jianxin Wu,et al.  Person Re-Identification with Correspondence Structure Learning , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[42]  Sergio A. Velastin,et al.  Local Fisher Discriminant Analysis for Pedestrian Re-identification , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[43]  Alessandro Perina,et al.  Person re-identification by symmetry-driven accumulation of local features , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[44]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[45]  Tao Xiang,et al.  Transferring a semantic representation for person re-identification and search , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[46]  Xiaogang Wang,et al.  DeepReID: Deep Filter Pairing Neural Network for Person Re-identification , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[47]  Xiaogang Wang,et al.  Human Reidentification with Transferred Metric Learning , 2012, ACCV.

[48]  Matti Pietikäinen,et al.  Modeling pixel process with scale invariant local patterns for background subtraction in complex scenes , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[49]  Gang Wang,et al.  Integrating parametric and non-parametric models for scene labeling , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[50]  Jian Sun,et al.  Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[51]  Shengcai Liao,et al.  Deep Metric Learning for Person Re-identification , 2014, 2014 22nd International Conference on Pattern Recognition.

[52]  Xiaogang Wang,et al.  Learning Mid-level Filters for Person Re-identification , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[53]  Horst Bischof,et al.  Large scale metric learning from equivalence constraints , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[54]  Fei-Fei Li,et al.  Visualizing and Understanding Recurrent Networks , 2015, ArXiv.

[55]  Sergey Ioffe,et al.  Rethinking the Inception Architecture for Computer Vision , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[56]  Xiaogang Wang,et al.  Unsupervised Salience Learning for Person Re-identification , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[57]  Huchuan Lu,et al.  Sample-Specific SVM Learning for Person Re-identification , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).