A Self-Supervised Gait Encoding Approach with Locality-Awareness for 3D Skeleton Based Person Re-Identification

Person re-identification (Re-ID) via gait features within 3D skeleton sequences is a newly emerging topic with several advantages. Existing solutions rely either on hand-crafted descriptors or on supervised gait representation learning. This paper proposes a self-supervised gait encoding approach that can leverage unlabeled skeleton data to learn gait representations for person Re-ID. Specifically, we first create self-supervision by learning to reconstruct unlabeled skeleton sequences in reverse order, which involves richer high-level semantics and yields better gait representations. Other pretext tasks are also explored to further improve self-supervised learning. Second, motion continuity gives adjacent skeletons within one skeleton sequence, and temporally consecutive skeleton sequences, higher correlations (referred to as locality in 3D skeleton data). Inspired by this, we propose a locality-aware attention mechanism and a locality-aware contrastive learning scheme, which aim to preserve locality-awareness at the intra-sequence level and the inter-sequence level, respectively, during self-supervised learning. Last, with context vectors learned by our locality-aware attention mechanism and contrastive learning scheme, a novel feature named Contrastive Attention-based Gait Encodings (CAGEs) is designed to represent gait effectively. Empirical evaluations show that our approach significantly outperforms skeleton-based counterparts by 15-40% Rank-1 accuracy, and that it even outperforms numerous multi-modal methods that use extra RGB or depth information. Our code is available at this https URL.
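The two ideas in the abstract, reverse reconstruction as a pretext task and locality within a sequence, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `reverse_target`, `locality_weights`, and `reconstruction_error` are hypothetical, and the Gaussian weighting is only one assumed way to favor temporally adjacent skeletons. A skeleton sequence is represented here as a plain list of frames, each frame a flat list of joint coordinates.

```python
import math


def reverse_target(sequence):
    """Return a copy of the skeleton sequence in reverse temporal order.

    In a reverse-reconstruction pretext task, a model that reads the
    sequence forward would be trained to output this reversed copy.
    """
    return [list(frame) for frame in reversed(sequence)]


def locality_weights(length, center, sigma=1.0):
    """Normalized weights over time steps that peak at `center`.

    ASSUMPTION: a Gaussian profile is used purely for illustration of
    "nearby frames matter more"; the abstract does not specify the form.
    """
    raw = [math.exp(-((t - center) ** 2) / (2.0 * sigma ** 2))
           for t in range(length)]
    total = sum(raw)
    return [w / total for w in raw]


def reconstruction_error(predicted, target):
    """Mean squared error between two sequences of identical shape."""
    squared = [(p - t) ** 2
               for p_frame, t_frame in zip(predicted, target)
               for p, t in zip(p_frame, t_frame)]
    return sum(squared) / len(squared)


if __name__ == "__main__":
    # Toy sequence: 4 frames, one joint with (x, y, z) coordinates.
    sequence = [[0.0, 0.0, 0.0],
                [0.1, 0.0, 0.0],
                [0.2, 0.0, 0.0],
                [0.3, 0.0, 0.0]]
    target = reverse_target(sequence)
    print(target[0])                               # last frame comes first
    print(reconstruction_error(target, target))    # perfect reconstruction
    print(locality_weights(len(sequence), center=1))
```

In this sketch the reversed copy serves as the training target, so no identity labels are needed, and the weights concentrate on frames near the chosen time step, which mirrors the notion of locality described above.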
