Discriminative feature extraction for video person re-identification via multi-task network

The goal of video-based person re-identification is to match pedestrians across image sequences captured by non-overlapping cameras. A critical issue in this task is how to exploit the useful information that videos provide. To address it, we propose a novel feature learning framework for video-based person re-identification. The proposed method captures the most significant information in the spatial and temporal domains and builds a discriminative and robust feature representation for each sequence. More specifically, to learn more effective frame-wise features, we introduce several attributes into the video-based task and build a multi-task network for identity and attribute classification. In the training phase, we present a multi-loss function that minimizes intra-class variance and maximizes inter-class differences. A feature aggregation network is then employed to aggregate the frame-wise features and extract temporal information from the video. Furthermore, considering that adjacent frames typically have similar appearance features, we propose the concept of “non-redundant appearance feature extraction” to obtain sequence-level appearance descriptors of pedestrians. Because the temporal feature and the non-redundant appearance feature are complementary, we combine them in the distance learning phase by assigning them different distance-weighted coefficients. Extensive experiments on three video-based datasets demonstrate the effectiveness of our method.
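The weighted combination of the two distances can be illustrated with a minimal sketch. It is not the authors' implementation: the Euclidean metric, the single coefficient `w_temporal`, the dictionary keys `"temporal"` and `"appearance"`, and the function names are all assumptions made for illustration, since the abstract does not specify how the coefficients are defined or chosen.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def fused_distance(probe, candidate, w_temporal=0.5):
    """Weighted sum of the temporal and appearance distances.

    `probe` and `candidate` are dicts with the hypothetical keys
    "temporal" and "appearance". `w_temporal` is an assumed coefficient
    in [0, 1]; the appearance distance receives weight 1 - w_temporal.
    """
    if not 0.0 <= w_temporal <= 1.0:
        raise ValueError("w_temporal must lie in [0, 1]")
    d_temporal = euclidean(probe["temporal"], candidate["temporal"])
    d_appearance = euclidean(probe["appearance"], candidate["appearance"])
    return w_temporal * d_temporal + (1.0 - w_temporal) * d_appearance


def rank_gallery(probe, gallery, w_temporal=0.5):
    """Return gallery ids ordered from closest to farthest match."""
    scored = [
        (fused_distance(probe, feats, w_temporal), gid)
        for gid, feats in gallery.items()
    ]
    return [gid for _, gid in sorted(scored)]


if __name__ == "__main__":
    # Toy features, invented purely to exercise the functions.
    probe = {"temporal": [0.0, 0.0], "appearance": [0.0, 0.0]}
    gallery = {
        "a": {"temporal": [3.0, 4.0], "appearance": [0.0, 0.0]},
        "b": {"temporal": [0.0, 0.0], "appearance": [6.0, 8.0]},
    }
    print(rank_gallery(probe, gallery, w_temporal=0.5))
    print(rank_gallery(probe, gallery, w_temporal=0.9))
```

In this toy setting, changing `w_temporal` changes which gallery sequence ranks first, which is the reason to weight the two distances differently rather than simply summing them.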
