Feature Aggregation With Reinforcement Learning for Video-Based Person Re-Identification

Video-based person re-identification (re-id) matches two tracks of persons captured by different cameras. Features are extracted from the images of a sequence and then aggregated into a single track feature. Unlike existing works, which aggregate frame features by simply averaging them or by using temporal models such as recurrent neural networks, we propose an intelligent feature aggregation method based on reinforcement learning. Specifically, we train an agent to decide which frames in the sequence should be discarded during aggregation, which can be treated as a decision-making process. In this way, the proposed method avoids introducing noisy information from the sequence and retains the valuable frames when generating a track feature. Experimental results on benchmark data sets show that our method noticeably improves the re-id accuracy of state-of-the-art models.
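The aggregation step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `aggregate_track_feature`, the boolean `keep` flags standing in for the agent's per-frame decisions, and the fallback when every frame is dropped are all assumptions made for illustration. The abstract does not specify how the agent is trained, and the sketch does not attempt it. It only contrasts plain averaging with averaging over the frames that were kept.

```python
from typing import List, Sequence


def aggregate_track_feature(
    frame_features: Sequence[Sequence[float]],
    keep: Sequence[bool],
) -> List[float]:
    """Average only the frame features whose keep flag is True.

    `keep` is a hypothetical stand-in for the agent's keep/drop decisions.
    """
    if not frame_features:
        raise ValueError("track has no frames")
    if len(frame_features) != len(keep):
        raise ValueError("need exactly one keep/drop decision per frame")

    kept = [f for f, k in zip(frame_features, keep) if k]
    if not kept:
        # Assumed fallback: if every frame is dropped, average all of them.
        kept = list(frame_features)

    dim = len(kept[0])
    return [sum(f[i] for f in kept) / len(kept) for i in range(dim)]


# Toy track: three consistent frames and one outlier (e.g. an occluded frame).
frames = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [0.0, 4.0]]

# Baseline: simple averaging keeps every frame, so the outlier leaks in.
plain_average = aggregate_track_feature(frames, [True, True, True, True])

# Selective aggregation: the last frame is discarded before averaging.
selected = aggregate_track_feature(frames, [True, True, True, False])
```

In this toy case the plain average is pulled toward the outlier, while the selective version reproduces the feature shared by the three consistent frames.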
