Video-based person re-identification using a novel feature extraction and fusion technique

Person re-identification has received extensive attention in the academic community. In this paper, a novel multiple feature fusion network (MPFF-Net) is proposed for video-based person re-identification. The network produces a robust and discriminative feature representation of the pedestrian in a video, and this representation consists of a hand-crafted part and a deep-learned part. First, image-level features are extracted from all consecutive frames. The hand-crafted branch then uses these descriptors to compute the average feature of the video and the frame-to-frame difference information. The deep-learned branch is based on a bidirectional LSTM (BiLSTM) network, which aggregates frame-wise representations of human regions into sequence-level features. This branch also accounts for the problem of misalignment. Finally, the hand-crafted and deep-learned parts are treated as complementary, and fusing them helps to capture the complete information in the video. Extensive experiments are conducted on the iLIDS-VID, PRID2011 and MARS datasets. The results demonstrate that the proposed algorithm outperforms state-of-the-art video-based re-identification methods.
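
The hand-crafted branch and the fusion step can be illustrated with a minimal sketch. The abstract does not specify the image-level descriptor, the exact difference statistic, or the fusion operator, so everything below is an assumption made for illustration: the mean absolute difference between consecutive frames stands in for the frame-to-frame information, concatenation after L2 normalisation stands in for the fusion, and the `learned` vector is a placeholder for the sequence-level output of the BiLSTM branch. All function names are hypothetical and are not taken from the paper.

```python
import math
from typing import List, Sequence

Vector = List[float]


def mean_feature(frames: Sequence[Sequence[float]]) -> Vector:
    """Average the per-frame descriptors over time (one value per dimension)."""
    n = len(frames)
    return [sum(column) / n for column in zip(*frames)]


def mean_abs_difference(frames: Sequence[Sequence[float]]) -> Vector:
    """Average absolute change between consecutive frame descriptors.

    Assumed stand-in for the paper's frame-to-frame difference information.
    """
    if len(frames) < 2:
        return [0.0] * len(frames[0])
    diffs = [
        [abs(b - a) for a, b in zip(prev, nxt)]
        for prev, nxt in zip(frames, frames[1:])
    ]
    return mean_feature(diffs)


def l2_normalize(vector: Sequence[float]) -> Vector:
    """Scale a vector to unit Euclidean length (zero vectors are returned as-is)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)


def video_descriptor(frames: Sequence[Sequence[float]],
                     learned: Sequence[float]) -> Vector:
    """Fuse the hand-crafted and deep-learned parts by concatenation (assumed)."""
    handcrafted = mean_feature(frames) + mean_abs_difference(frames)
    # Normalise each part separately so neither dominates the fused vector.
    return l2_normalize(handcrafted) + l2_normalize(learned)


if __name__ == "__main__":
    # Three frames with toy 2-D descriptors; `learned` mimics a BiLSTM output.
    frames = [[1.0, 2.0], [3.0, 2.0], [5.0, 8.0]]
    learned = [0.0, 2.0]
    print(video_descriptor(frames, learned))
```

Normalising the two parts separately before concatenation is one simple way to keep either branch from dominating a distance computed on the fused vector; the weighting actually used by MPFF-Net may differ.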
