Introducing Scene Understanding to Person Re-Identification using a Spatio-Temporal Multi-Camera Model

In this paper, we investigate person re-identification (re-ID) in a multi-camera network for surveillance applications. To this end, we create a Spatio-Temporal Multi-Camera model (ST-MC model), which exploits statistical data on a person’s entry/exit points in the multi-camera network to predict in which camera view a person will re-appear. The ST-MC model is used as a novel extension to the Multiple Granularity Network (MGN) [1], which is the current state of the art in person re-ID. Compared to existing approaches that are based solely on Convolutional Neural Networks (CNNs), our approach improves re-ID performance by considering not only the appearance-based features of a person from a CNN, but also contextual information. The latter serves as scene-understanding information that is complementary to appearance-based person re-ID. Experimental results show that, on the DukeMTMC-reID dataset [2][3], introducing our ST-MC model substantially increases the mean Average Precision (mAP) from 77.2% to 84.1% and the Rank-1 score from 88.6% to 96.2%.
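The general idea of combining appearance with camera-transition statistics can be illustrated with a minimal sketch. This is not the ST-MC model itself: the abstract does not specify how the entry/exit statistics are modeled or fused with the CNN features, so the smoothed transition counts, the multiplicative fusion, and all names below (`estimate_transition_probs`, `rerank`, the toy data) are assumptions made purely for illustration.

```python
from collections import defaultdict


def estimate_transition_probs(transitions, num_cameras, smoothing=1.0):
    """Estimate P(entry camera | exit camera) from observed camera pairs.

    `transitions` is a list of (exit_cam, entry_cam) index pairs.
    Additive smoothing keeps unseen transitions at a non-zero probability.
    """
    counts = defaultdict(lambda: [0.0] * num_cameras)
    for exit_cam, entry_cam in transitions:
        counts[exit_cam][entry_cam] += 1.0

    probs = {}
    for cam in range(num_cameras):
        row = [c + smoothing for c in counts[cam]]
        total = sum(row)
        probs[cam] = [v / total for v in row]
    return probs


def rerank(query_cam, gallery, probs):
    """Re-rank gallery candidates by appearance similarity times camera prior.

    `gallery` is a list of (gallery_id, camera, appearance_similarity) tuples.
    The multiplicative fusion is an illustrative assumption, not the paper's rule.
    """
    scored = [(gid, sim * probs[query_cam][cam]) for gid, cam, sim in gallery]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical toy data: people leaving camera 0 mostly re-appear in camera 1.
transitions = [(0, 1), (0, 1), (0, 1), (0, 2), (1, 2)]
probs = estimate_transition_probs(transitions, num_cameras=3)

# Candidate "a" looks slightly more similar, but sits in a less likely camera.
gallery = [("a", 2, 0.80), ("b", 1, 0.70)]
ranking = rerank(0, gallery, probs)
```

In this toy case the contextual prior moves candidate "b" ahead of the visually closer candidate "a", which is the kind of correction that contextual information is meant to provide on top of appearance features.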

[1] Xiong Chen, et al. Learning Discriminative Features with Multiple Granularities for Person Re-Identification, 2018, ACM Multimedia.

[2] Xiaogang Wang, et al. Learning Deep Feature Representations with Domain Guided Dropout for Person Re-identification, 2016, IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[3] Francesco Solera, et al. Performance Measures and a Data Set for Multi-target, Multi-camera Tracking, 2016, ECCV Workshops.

[4] Trevor Darrell, et al. Simultaneous calibration and tracking with a network of non-overlapping sensors, 2004, IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR).

[5] Zhedong Zheng, et al. Joint Discriminative and Generative Learning for Person Re-Identification, 2019, IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[6] Yang Yang, et al. ABD-Net: Attentive but Diverse Person Re-Identification, 2019, IEEE/CVF International Conference on Computer Vision (ICCV).

[7] Jian-Huang Lai, et al. Spatial-Temporal Person Re-identification, 2018, AAAI.

[8] Yi Yang, et al. Unlabeled Samples Generated by GAN Improve the Person Re-identification Baseline in Vitro, 2017, IEEE International Conference on Computer Vision (ICCV).

[9] S. Rigatti. Random Forest, 2017, Journal of Insurance Medicine.

[10] Yunchao Wei, et al. Horizontal Pyramid Matching for Person Re-identification, 2018, AAAI.

[11] Jian Sun, et al. AlignedReID: Surpassing Human-Level Performance in Person Re-Identification, 2017, ArXiv.

[12] Ramakant Nevatia, et al. Camera calibration from video of a walking human, 2006, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[13] Antonio Criminisi, et al. Decision Forests for Computer Vision and Medical Image Analysis, 2013, Advances in Computer Vision and Pattern Recognition.

[14] Kuk-Jin Yoon, et al. Distance-based Camera Network Topology Inference for Person Re-identification, 2019, Pattern Recognit. Lett.

[15] Yifan Sun, et al. SVDNet for Pedestrian Retrieval, 2017, IEEE International Conference on Computer Vision (ICCV).

[16] Yoo-Joo Choi, et al. Learning Spatio-Temporal Topology of a Multi-Camera Network by Tracking Multiple People, 2007.

[17] Hao Helen Zhang, et al. Hard or Soft Classification? Large-Margin Unified Machines, 2011, Journal of the American Statistical Association.

[18] F. Chang, et al. Topology Learning of Non-overlapping Multi-camera Network, 2015.

[19] Mingjie Ma, et al. A simplified nonlinear regression method for human height estimation in video surveillance, 2015, EURASIP J. Image Video Process.

[20] Gérard G. Medioni, et al. Exploring context information for inter-camera multiple target tracking, 2014, IEEE Winter Conference on Applications of Computer Vision.

[21] Naila Murray, et al. Re-ID done right: towards good practices for person re-identification, 2018, ArXiv.

[22] Lucas Beyer, et al. In Defense of the Triplet Loss for Person Re-Identification, 2017, ArXiv.

[23] Shishir K. Shah, et al. A survey of approaches and trends in person re-identification, 2014, Image Vis. Comput.

[24] Wei Jiang, et al. Bag of Tricks and a Strong Baseline for Deep Person Re-Identification, 2019, IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[25] Bernt Schiele, et al. Parameter-Free Spatial Attention Network for Person Re-Identification, 2018, ArXiv.

[26] Prajjwal Bhargava. Incremental Learning in Person Re-Identification, 2018, ArXiv.

[27] Gian Luca Foresti, et al. Person Reidentification in a Distributed Camera Network Framework, 2017, IEEE Transactions on Cybernetics.

[28] Qi Tian, et al. Beyond Part Models: Person Retrieval with Refined Part Pooling, 2017, ECCV.

[29] Andrea Cavallaro, et al. Omni-Scale Feature Learning for Person Re-Identification, 2019, IEEE/CVF International Conference on Computer Vision (ICCV).

[30] Slawomir Bak, et al. Person re-identification by pose priors, 2015, Electronic Imaging.