A spatial and temporal features mixture model with body parts for video-based person re-identification

The goal of video-based person re-identification is to recognize the same person across different camera views. Most previous methods represent a person with features extracted from the full body. In this paper, we propose a novel Spatial and Temporal Features Mixture Model (STFMM). Unlike previous approaches, our model first splits the human body horizontally into N parts, which capture information from the head, waist, legs and so on. The features of the individual parts are then integrated to obtain a more expressive representation of each person. Experiments conducted on the iLIDS-VID and PRID-2011 datasets demonstrate that our approach outperforms existing video-based person re-identification methods and significantly improves stability. Our model achieves a rank-1 CMC accuracy of 73.6% on the iLIDS-VID dataset and a rank-1 CMC accuracy of 47.8% in cross-dataset testing.
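The part-based idea can be illustrated with a minimal Python sketch. It is not the STFMM architecture itself: the abstract does not specify the feature extractor or how part features are integrated, so the mean-colour descriptor, the concatenation step, and the temporal average pooling below are stand-in assumptions, and the function names are hypothetical.

```python
import numpy as np

def split_into_parts(frame, n_parts):
    """Split an H x W x C frame into n_parts horizontal stripes, top to bottom."""
    return np.array_split(frame, n_parts, axis=0)

def part_feature(part):
    """Stand-in descriptor (assumption): mean value of each channel in the stripe."""
    return part.mean(axis=(0, 1))

def sequence_representation(frames, n_parts):
    """Build one vector per video sequence.

    Spatial step: concatenate the per-part descriptors of each frame.
    Temporal step (assumption): average the frame vectors over time.
    """
    per_frame = []
    for frame in frames:
        parts = split_into_parts(frame, n_parts)
        per_frame.append(np.concatenate([part_feature(p) for p in parts]))
    return np.mean(per_frame, axis=0)

# Toy sequence: 5 frames of size 128 x 64 with 3 channels, split into N = 4 parts.
frames = np.zeros((5, 128, 64, 3))
frames[:, :32] = 1.0  # only the top stripe (the "head" region) is non-zero
representation = sequence_representation(frames, n_parts=4)
```

With N = 4 and three channels, the resulting vector has 12 entries, one block of three per body part. In a learned model, each stripe would be passed through a trained feature extractor rather than a colour average.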
