Image Analysis and Processing – ICIAP 2019: 20th International Conference, Trento, Italy, September 9–13, 2019, Proceedings, Part I

In this paper we consider the problem of video-based person re-identification, the task of associating videos of the same person captured by different, non-overlapping cameras. We propose a Siamese framework in which the video frames of the person to re-identify and those of the candidate are processed by two identical networks that produce a similarity score. We introduce attention mechanisms to capture the relevant information both at frame level (spatial information) and at video level (temporal information, given by the importance of a specific frame within the sequence). One of the novelties of our approach is the joint, concurrent processing of the frame and video levels, which yields a very simple architecture. Despite this simplicity, our approach achieves better performance than the state of the art on the challenging iLIDS-VID dataset.
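The idea of attention at two levels inside a Siamese comparison can be illustrated with a minimal sketch. This is not the architecture from the paper: it is a generic soft-attention pooling example that assumes each frame is already described by a few region feature vectors, uses hypothetical learned query vectors (`spatial_query`, `temporal_query`), applies the two attention levels one after the other rather than jointly, and scores a pair of videos with cosine similarity. All function names are illustrative.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def attend(vectors, query):
    """Soft attention: score each vector against the query,
    normalise the scores, and return the weighted sum and the weights."""
    weights = softmax([dot(v, query) for v in vectors])
    dim = len(vectors[0])
    pooled = [sum(w * v[i] for v, w in zip(vectors, weights)) for i in range(dim)]
    return pooled, weights


def embed_video(video, spatial_query, temporal_query):
    """video: list of frames; each frame: list of region feature vectors.
    Frame level: pool regions (spatial attention).
    Video level: pool frames (temporal attention)."""
    frame_vecs = [attend(frame, spatial_query)[0] for frame in video]
    return attend(frame_vecs, temporal_query)


def similarity(video_a, video_b, spatial_query, temporal_query):
    """Siamese comparison: both branches share the same parameters
    (assumption: cosine similarity stands in for the learned score)."""
    ea, _ = embed_video(video_a, spatial_query, temporal_query)
    eb, _ = embed_video(video_b, spatial_query, temporal_query)
    norm = math.sqrt(dot(ea, ea) * dot(eb, eb))
    return dot(ea, eb) / norm if norm else 0.0


def random_video(rng, n_frames=4, n_regions=3, dim=5):
    """Toy stand-in for per-region CNN features (hypothetical data)."""
    return [[[rng.gauss(0, 1) for _ in range(dim)]
             for _ in range(n_regions)]
            for _ in range(n_frames)]


if __name__ == "__main__":
    rng = random.Random(0)
    sq = [rng.gauss(0, 1) for _ in range(5)]
    tq = [rng.gauss(0, 1) for _ in range(5)]
    probe, candidate = random_video(rng), random_video(rng)
    _, frame_weights = embed_video(probe, sq, tq)
    print("per-frame attention weights:", frame_weights)
    print("similarity score:", similarity(probe, candidate, sq, tq))
```

The weights returned by `embed_video` sum to one and indicate how much each frame contributes to the video descriptor, which is the sense in which temporal attention expresses the importance of a frame within the sequence.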
