Reinforced Temporal Attention and Split-Rate Transfer for Depth-Based Person Re-identification

We address the problem of person re-identification from commodity depth sensors. One challenge for depth-based recognition is data scarcity. Our first contribution addresses this problem by introducing split-rate RGB-to-Depth transfer, which leverages large RGB datasets more effectively than popular fine-tuning approaches. Our transfer scheme is based on the observation that the model parameters at the bottom layers of a deep convolutional neural network can be directly shared between RGB and depth data while the remaining layers need to be fine-tuned rapidly. Our second contribution enhances re-identification for video by implementing temporal attention as a Bernoulli-Sigmoid unit acting upon frame-level features. Since this unit is stochastic, the temporal attention parameters are trained using reinforcement learning. Extensive experiments validate the accuracy of our method in person re-identification from depth sequences. Finally, in a scenario where subjects wear unseen clothes, we show large performance gains compared to a state-of-the-art model which relies on RGB data.

[1]  Zhen Li,et al.  Learning Locally-Adaptive Decision Functions for Person Verification , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[2]  Wei-Shi Zheng,et al.  Cross-View Asymmetric Metric Learning for Unsupervised Person Re-Identification , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[3]  Horst Bischof,et al.  Large scale metric learning from equivalence constraints , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[4]  Xiaogang Wang,et al.  Joint Detection and Identification Feature Learning for Person Search , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[5]  Wei Zeng,et al.  Silhouette-Based Gait Recognition via Deterministic Learning , 2013, BICS.

[6]  S. Gong,et al.  Person Re-Identification by Unsupervised \ell _1 ℓ 1 Graph Learning , 2016, ECCV.

[7]  Yao Li,et al.  Sequential Person Recognition in Photo Albums with a Recurrent Network , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[8]  Thomas B. Moeslund,et al.  Multimodal person re-identification using RGB-D sensors and a transient identification database , 2013, 2013 International Workshop on Biometrics and Forensics (IWBF).

[9]  Alberto Del Bimbo,et al.  Person Re-Identification by Iterative Re-Weighted Sparse Ranking , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[10]  Aristidis Likas,et al.  A Reinforcement Learning Approach to Online Clustering , 1999, Neural Computation.

[11]  Bingpeng Ma,et al.  Local Descriptors Encoded by Fisher Vectors for Person Re-identification , 2012, ECCV Workshops.

[12]  Liang Lin,et al.  Deep feature learning with relative distance comparison for person re-identification , 2015, Pattern Recognit..

[13]  Kaiqi Huang,et al.  Learning Deep Context-Aware Features over Body and Latent Parts for Person Re-identification , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[14]  Edward J. Delp,et al.  A Two Stream Siamese Convolutional Neural Network for Person Re-identification , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[15]  Jean Ponce,et al.  A Theoretical Analysis of Feature Pooling in Visual Recognition , 2010, ICML.

[16]  Alex Graves,et al.  Recurrent Models of Visual Attention , 2014, NIPS.

[17]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[18]  Xiaogang Wang,et al.  Deep Learning Face Representation from Predicting 10,000 Classes , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[19]  Shiliang Zhang,et al.  Pose-Driven Deep Convolutional Model for Person Re-identification , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[20]  Huchuan Lu,et al.  Stepwise Metric Promotion for Unsupervised Video Person Re-identification , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[21]  Kuk-Jin Yoon,et al.  Improving Person Re-identification via Pose-Aware Multi-shot Matching , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[22]  Jian-Huang Lai,et al.  RGB-Infrared Cross-Modality Person Re-identification , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[23]  Shengcai Liao,et al.  Person re-identification by Local Maximal Occurrence representation and metric learning , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[24]  Rama Chellappa,et al.  Gait Analysis for Human Identification , 2003, AVBPA.

[25]  Qi Tian,et al.  Person Re-identification in the Wild , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[26]  Yasushi Makihara,et al.  Gait Recognition under Speed Transition , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[27]  Shengcai Liao,et al.  Embedding Deep Metric for Person Re-identification: A Study Against Large Variations , 2016, ECCV.

[28]  Trevor Darrell,et al.  Recognizing Image Style , 2013, BMVC.

[29]  Manuel J. Marín-Jiménez,et al.  Automatic Learning of Gait Signatures for People Identification , 2016, IWANN.

[30]  Andreas Stafylopatis,et al.  Enhancing Stochasticity in Reinforcement Learning Schemes: Application to the Exploration of Binary Domains , 1995 .

[31]  Jiahuan Zhou,et al.  Efficient Online Local Metric Adaptation via Negative Samples for Person Re-identification , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[32]  Tao Xiang,et al.  Multi-scale Deep Learning Architectures for Person Re-identification , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[33]  Hao Guo,et al.  Learning View-Invariant Features for Person Identification in Temporally Synchronized Videos Taken by Wearable Cameras , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[34]  Slawomir Bak,et al.  One-Shot Metric Learning for Person Re-identification , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[35]  Yu Cheng,et al.  Jointly Attentive Spatial-Temporal Pooling Networks for Video-Based Person Re-identification , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[36]  Ling Shao,et al.  Fast Person Re-identification via Cross-Camera Semantic Binary Transformation , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[37]  François Charpillet,et al.  A gait analysis method based on a depth camera for fall prevention , 2014, 2014 36th Annual International Conference of the IEEE Engineering in Medicine and Biology Society.

[38]  Yoshua Bengio,et al.  How transferable are features in deep neural networks? , 2014, NIPS.

[39]  Xiaogang Wang,et al.  DeepReID: Deep Filter Pairing Neural Network for Person Re-identification , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[40]  Rafael Medina Carnicer,et al.  Pyramidal Fisher Motion for Multiview Gait Recognition , 2014, 2014 22nd International Conference on Pattern Recognition.

[41]  Nanning Zheng,et al.  Similarity Learning with Spatial Constraints for Person Re-identification , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[42]  Andrew W. Fitzgibbon,et al.  Real-time human pose recognition in parts from single depth images , 2011, CVPR 2011.

[43]  Shaogang Gong,et al.  Person Re-Identification by Support Vector Ranking , 2010, BMVC.

[44]  Qi Tian,et al.  Scalable Person Re-identification: A Benchmark , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[45]  Shengcai Liao,et al.  Salient Color Names for Person Re-identification , 2014, ECCV.

[46]  J. M. Mossi,et al.  Who is who at different cameras: people re-identification using depth cameras , 2012 .

[47]  Sergio A. Velastin,et al.  Local Fisher Discriminant Analysis for Pedestrian Re-identification , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[48]  Alessandro Perina,et al.  Person re-identification by symmetry-driven accumulation of local features , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[49]  Jingdong Wang,et al.  Deeply-Learned Part-Aligned Representations for Person Re-identification , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[50]  Jiwen Lu,et al.  Consistent-Aware Deep Learning for Person Re-identification in a Camera Network , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[51]  Shaogang Gong,et al.  Reidentification by Relative Distance Comparison , 2013, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[52]  Christopher Joseph Pal,et al.  Describing Videos by Exploiting Temporal Structure , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[53]  Fei Xiong,et al.  Person Re-Identification Using Kernel-Based Metric Learning Methods , 2014, ECCV.

[54]  Ehud Rivlin,et al.  Color Invariants for Person Reidentification , 2013, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[55]  Jitendra Malik,et al.  Learning Rich Features from RGB-D Images for Object Detection and Segmentation , 2014, ECCV.

[56]  Shaogang Gong,et al.  Person Re-Identification by Unsupervised \ell _1 ℓ 1 Graph Learning , 2016, ECCV.

[57]  Song Wang,et al.  Person Identification Using Full-Body Motion and Anthropometric Biometrics from Kinect Videos , 2012, ECCV Workshops.

[58]  Clément Farabet,et al.  Torch7: A Matlab-like Environment for Machine Learning , 2011, NIPS 2011.

[59]  Xiaogang Wang,et al.  Person Re-identification by Salience Matching , 2013, 2013 IEEE International Conference on Computer Vision.

[60]  Bingbing Ni,et al.  Person Re-identification via Recurrent Feature Aggregation , 2016, ECCV.

[61]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[62]  Dumitru Erhan,et al.  Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[63]  Luc Van Gool,et al.  One-Shot Person Re-identification with a Consumer Depth Camera , 2014, Person Re-Identification.

[64]  Björn W. Schuller,et al.  The TUM Gait from Audio, Image and Depth (GAID) database: Multimodal recognition of subjects and traits , 2014, J. Vis. Commun. Image Represent..

[65]  Dimitrios Tzovaras,et al.  Gait Recognition Using Compact Feature Extraction Transforms and Depth Information , 2007, IEEE Transactions on Information Forensics and Security.

[66]  Xiaogang Wang,et al.  Unsupervised Salience Learning for Person Re-identification , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[67]  Anton van den Hengel,et al.  Learning to rank in person re-identification with metric ensembles , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[68]  R. J. Williams,et al.  Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning , 2004, Machine Learning.

[69]  Trevor Darrell,et al.  Caffe: Convolutional Architecture for Fast Feature Embedding , 2014, ACM Multimedia.

[70]  Liang Zheng,et al.  Re-ranking Person Re-identification with k-Reciprocal Encoding , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[71]  Frédéric Jurie,et al.  PCCA: A new approach for distance learning from sparse pairwise constraints , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[72]  Nitish Srivastava,et al.  Improving neural networks by preventing co-adaptation of feature detectors , 2012, ArXiv.

[73]  Yi Yang,et al.  Unlabeled Samples Generated by GAN Improve the Person Re-identification Baseline in Vitro , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[74]  Ja-Chen Lin,et al.  A new LDA-based face recognition system which can solve the small sample size problem , 1998, Pattern Recognit..

[75]  Li Fei-Fei,et al.  Recurrent Attention Models for Depth-Based Person Identification , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[76]  Sergey Ioffe,et al.  Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift , 2015, ICML.

[77]  Shishir K. Shah,et al.  A survey of approaches and trends in person re-identification , 2014, Image Vis. Comput..

[78]  Xiaogang Wang,et al.  Spindle Net: Person Re-identification with Human Body Region Guided Feature Decomposition and Fusion , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[79]  Nanning Zheng,et al.  Point to Set Similarity Based Deep Feature Learning for Person Re-Identification , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[80]  Quoc V. Le,et al.  Listen, Attend and Spell , 2015, ArXiv.

[81]  Michael S. Bernstein,et al.  ImageNet Large Scale Visual Recognition Challenge , 2014, International Journal of Computer Vision.

[82]  Trevor Darrell,et al.  Learning Features by Watching Objects Move , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[83]  Shengcai Liao,et al.  Deep Metric Learning for Person Re-identification , 2014, 2014 22nd International Conference on Pattern Recognition.

[84]  Wojciech Zaremba,et al.  Learning to Execute , 2014, ArXiv.

[85]  Luc Van Gool,et al.  3D reconstruction of freely moving persons for re-identification with a depth sensor , 2014, 2014 IEEE International Conference on Robotics and Automation (ICRA).

[86]  Amit K. Roy-Chowdhury,et al.  Temporal Model Adaptation for Person Re-identification , 2016, ECCV.

[87]  Xuelong Li,et al.  Person Re-Identification by Regularized Smoothing KISS Metric Learning , 2013, IEEE Transactions on Circuits and Systems for Video Technology.

[88]  Michael Jones,et al.  An improved deep learning architecture for person re-identification , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[89]  Zhen Zhou,et al.  See the Forest for the Trees: Joint Spatial and Temporal Recurrent Neural Networks for Video-Based Person Re-identification , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[90]  Kaiqi Huang,et al.  Beyond Triplet Loss: A Deep Quadruplet Network for Person Re-identification , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[91]  Xiaogang Wang,et al.  Learning Deep Feature Representations with Domain Guided Dropout for Person Re-identification , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[92]  Jesús Martínez del Rincón,et al.  Recurrent Convolutional Network for Video-Based Person Re-identification , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[93]  Bernt Schiele,et al.  Multiple People Tracking by Lifted Multicut and Person Re-identification , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[94]  Alessio Del Bue,et al.  Re-identification with RGB-D Sensors , 2012, ECCV Workshops.

[95]  Qi Tian,et al.  Scalable Person Re-identification on Supervised Smoothed Manifold , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[96]  Xiaogang Wang,et al.  Learning Mid-level Filters for Person Re-identification , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[97]  Shiliang Zhang,et al.  Deep Attributes Driven Multi-Camera Person Re-identification , 2016, ECCV.

[98]  Qi Tian,et al.  MARS: A Video Benchmark for Large-Scale Person Re-Identification , 2016, ECCV.

[99]  Alberto Del Bimbo,et al.  Matching People across Camera Views using Kernel Canonical Correlation Analysis , 2014, ICDSC.

[100]  Shaogang Gong,et al.  Learning a Discriminative Null Space for Person Re-identification , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[101]  Luis Herranz,et al.  Depth CNNs for RGB-D Scene Recognition: Learning from Scratch Better than Transferring from RGB-CNNs , 2017, AAAI.

[102]  Christopher D. Manning,et al.  Effective Approaches to Attention-based Neural Machine Translation , 2015, EMNLP.

[103]  Hai Tao,et al.  Viewpoint Invariant Pedestrian Recognition with an Ensemble of Localized Features , 2008, ECCV.

[104]  Rita Cucchiara,et al.  People reidentification in surveillance and forensics , 2013, ACM Comput. Surv..

[105]  Ricardo Matsumura de Araújo,et al.  Anthropometric and human gait identification using skeleton data from Kinect sensor , 2014, SAC.

[106]  David Zhang,et al.  Joint Learning of Single-Image and Cross-Image Representations for Person Re-identification , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[107]  Trevor Darrell,et al.  Long-term recurrent convolutional networks for visual recognition and description , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[108]  Dacheng Tao,et al.  Person Re-Identification Over Camera Networks Using Multi-Task Distance Metric Learning , 2014, IEEE Transactions on Image Processing.

[109]  Sridha Sridharan,et al.  Gait energy volumes and frontal gait recognition using depth images , 2011, 2011 International Joint Conference on Biometrics (IJCB).

[110]  Hua Li,et al.  3D gait recognition using multiple cameras , 2006, 7th International Conference on Automatic Face and Gesture Recognition (FGR06).