Learning Rich Features for Gait Recognition by Integrating Skeletons and Silhouettes

Gait recognition identifies an individual from the gait patterns captured in a walking sequence. Most existing gait recognition methods learn features from either silhouettes or skeletons because both are robust to clothing, carrying, and other exterior factors. The combination of the two data modalities, however, has not been fully exploited. This paper proposes a simple yet effective bimodal fusion (BiFusion) network, which mines the complementary clues of skeletons and silhouettes to learn rich features for gait identification. In particular, the inherent hierarchical semantics of body joints in a skeleton are leveraged to design a novel Multi-scale Gait Graph (MSGG) network for skeleton feature extraction. Extensive experiments on CASIA-B and OUMVLP demonstrate both the superiority of the proposed MSGG network in modeling skeletons and the effectiveness of the bimodal fusion for gait recognition. Under the most challenging condition on CASIA-B, walking in different clothes, our method achieves a rank-1 accuracy of 92.1%. The code will be released at https://github.com/YunjiePeng/BimodalFusion after acceptance.
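To make the reported metric concrete, the following minimal sketch shows how rank-1 accuracy is commonly computed for features fused from two modalities. It is not the BiFusion network: the fusion rule here (concatenating L2-normalized vectors), the function names `l2_normalize`, `fuse`, and `rank1_accuracy`, and the toy feature values are all assumptions made for illustration.

```python
import math


def l2_normalize(vec):
    """Scale a feature vector to unit length (left unchanged if all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def fuse(skeleton_feat, silhouette_feat):
    """Assumed fusion rule: concatenate the normalized per-modality features."""
    return l2_normalize(skeleton_feat) + l2_normalize(silhouette_feat)


def rank1_accuracy(probe_feats, probe_ids, gallery_feats, gallery_ids):
    """Fraction of probes whose nearest gallery feature has the same identity."""
    hits = 0
    for feat, pid in zip(probe_feats, probe_ids):
        dists = [math.dist(feat, g) for g in gallery_feats]
        nearest = min(range(len(dists)), key=dists.__getitem__)
        hits += gallery_ids[nearest] == pid
    return hits / len(probe_ids)


# Hypothetical toy features: two identities, one gallery and one probe each.
gallery = [fuse([1.0, 0.0], [0.9, 0.1]), fuse([0.0, 1.0], [0.1, 0.9])]
gallery_ids = ["A", "B"]
probes = [fuse([0.9, 0.1], [1.0, 0.0]), fuse([0.2, 0.8], [0.0, 1.0])]
probe_ids = ["A", "B"]

acc = rank1_accuracy(probes, probe_ids, gallery, gallery_ids)
print(f"rank-1 accuracy: {acc:.1%}")
```

Normalizing each modality before concatenation keeps either feature from dominating the Euclidean distance purely because of its scale; a learned fusion module would replace this step.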
