Multi-Level Graph Encoding with Structural-Collaborative Relation Learning for Skeleton-Based Person Re-Identification

Skeleton-based person re-identification (Re-ID) is an emerging open topic of great value for safety-critical applications. Existing methods typically extract hand-crafted features or model skeleton dynamics from the trajectories of body joints, but they rarely explore the valuable relation information contained in body structure or motion. To fully explore body relations, we construct graphs that model human skeletons at different levels, and for the first time propose a Multi-level Graph encoding approach with Structural-Collaborative Relation learning (MG-SCR) to encode discriminative graph features for person Re-ID. Specifically, considering that structurally connected body components are highly correlated in a skeleton, we first propose a multi-head structural relation layer to learn different relations among neighboring body-component nodes in the graphs, which helps aggregate key correlative features into effective node representations. Second, inspired by the fact that body-component collaboration in walking usually carries recognizable patterns, we propose a cross-level collaborative relation layer to infer collaboration between components at different levels, so as to capture more discriminative skeleton graph features. Finally, to enhance the encoding of graph dynamics, we propose a novel self-supervised sparse sequential prediction task for model pre-training, which facilitates encoding high-level graph semantics for person Re-ID. MG-SCR outperforms state-of-the-art skeleton-based methods, and it also outperforms many multi-modal methods that use extra RGB or depth features. Our code is available at https://github.com/Kali-Hac/MG-SCR.
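As a rough illustration of the idea behind the structural relation layer, the sketch below shows one attention head that re-encodes each skeleton node as a softmax-weighted combination of its structurally connected neighbors. It is not the MG-SCR implementation: the 5-joint toy skeleton, the dot-product relation score, and the names `EDGES`, `neighbors`, and `aggregate` are all assumptions made for illustration, and a multi-head version would repeat this step with separate learned parameters per head.

```python
import math

# Hypothetical toy skeleton: 5 joints, undirected bone connections.
EDGES = [(0, 1), (1, 2), (1, 3), (1, 4)]

def neighbors(num_nodes, edges):
    """Build an adjacency list from undirected edges."""
    adj = {i: [] for i in range(num_nodes)}
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)
    return adj

def softmax(scores):
    """Numerically stable softmax over a non-empty list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def aggregate(features, adj):
    """One head: each node becomes a relation-weighted sum of its neighbors.

    The relation score is a plain dot product (an assumption; a learned
    scoring function would normally be used here).
    """
    out = []
    for i, f_i in enumerate(features):
        nbrs = adj[i]
        scores = [sum(x * y for x, y in zip(f_i, features[j])) for j in nbrs]
        weights = softmax(scores)
        out.append([
            sum(w * features[j][d] for w, j in zip(weights, nbrs))
            for d in range(len(f_i))
        ])
    return out

# Toy 3D joint coordinates (made-up values).
joints = [
    [0.0, 1.0, 0.0],   # head
    [0.0, 0.5, 0.0],   # torso
    [0.0, 0.0, 0.0],   # hip
    [-0.3, 0.5, 0.0],  # left arm
    [0.3, 0.5, 0.0],   # right arm
]
adj = neighbors(len(joints), EDGES)
encoded = aggregate(joints, adj)
```

Here `encoded` holds one updated feature vector per joint. A node with a single neighbor, such as the head in this toy graph, simply takes on that neighbor's features, while the torso node mixes all four of its neighbors according to the relation weights.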
