Effective multi-shot person re-identification through representative frames selection and temporal feature pooling

Multi-shot person re-identification (ReID) is a common setting of person ReID in which a set of images is processed for each person. However, using the entire image set, as most existing approaches do, is not always effective because it is time- and memory-consuming. The main contribution of this work is a pair of strategies for (1) choosing representative image frames for each individual instead of the entire set of frames, and (2) temporal feature pooling in multi-shot person ReID. These strategies are integrated into a person ReID framework that uses GoG (Gaussian of Gaussian) features and XQDA (Cross-view Quadratic Discriminant Analysis) metric learning for person representation and matching, respectively. The effectiveness of the proposed framework is investigated and analyzed in depth on two benchmark datasets (PRID 2011 and iLIDS-VID) in terms of re-identification accuracy, computational time, and storage requirements. The experimental results yield several recommendations on the use of these schemes, based on the characteristics of the working dataset and the requirements of the application. Furthermore, the study also offers a desktop-based application for person search and ReID. The implementation of the proposed framework will be made publicly available.
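
To make the two strategies concrete, the following is a minimal sketch and not the authors' implementation. It assumes that per-frame feature vectors (for example, GoG descriptors) have already been extracted, and it uses uniform subsampling as a stand-in for representative frame selection, since the abstract does not specify the selection criterion. The function names `select_every_kth` and `temporal_pool`, and the pooling modes offered, are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def select_every_kth(frames: Sequence, step: int) -> list:
    """Keep every `step`-th frame.

    Assumption: uniform subsampling stands in for the paper's
    representative-frame selection, which the abstract does not detail.
    """
    if step < 1:
        raise ValueError("step must be >= 1")
    return list(frames[::step])


def temporal_pool(features: Sequence[Sequence[float]], mode: str = "average") -> List[float]:
    """Collapse a sequence of per-frame feature vectors into one vector.

    Pooling is applied independently to each feature dimension across time.
    """
    if not features:
        raise ValueError("at least one feature vector is required")
    # Transpose so that each tuple holds one dimension's values over all frames.
    columns = list(zip(*features))
    if mode == "average":
        return [sum(col) / len(col) for col in columns]
    if mode == "max":
        return [max(col) for col in columns]
    if mode == "min":
        return [min(col) for col in columns]
    raise ValueError(f"unknown pooling mode: {mode}")


# Toy per-frame features: 4 frames, 2 dimensions each (illustrative values only).
frame_features = [[1.0, 4.0], [3.0, 2.0], [5.0, 0.0], [7.0, 6.0]]
selected = select_every_kth(frame_features, step=2)   # keeps frames 0 and 2
pooled_avg = temporal_pool(selected, mode="average")  # [3.0, 2.0]
pooled_max = temporal_pool(selected, mode="max")      # [5.0, 4.0]
```

In this sketch, selecting fewer frames reduces the number of feature vectors that must be stored, and pooling reduces each person's sequence to a single vector, so matching needs only one distance computation per gallery identity.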
