GaitSet: Cross-View Gait Recognition Through Utilizing Gait As a Deep Set

Gait is a unique biometric feature that can be recognized at a distance, and it has broad applications in crime prevention, forensic identification, and social security. To portray a gait, existing gait recognition methods utilize either a gait template, which makes it difficult to preserve temporal information, or a gait sequence, which maintains unnecessary sequential constraints and thus loses the flexibility of gait recognition. In this paper, we present a novel perspective that utilizes gait as a deep set, meaning that a set of gait frames is integrated by a global-local fused deep network, inspired by the way our left and right hemispheres process information, to learn information that can be used in identification. Based on this deep set perspective, our method is immune to frame permutations and can naturally integrate frames from different videos that have been acquired under different scenarios, such as diverse viewing angles, different clothes, or different item-carrying conditions. Experiments show that under normal walking conditions, our single-model method achieves an average rank-1 accuracy of 96.1\% on the CASIA-B gait dataset and an accuracy of 87.9\% on the OU-MVLP gait dataset. Moreover, the proposed method maintains a satisfactory accuracy even when only small numbers of frames are available in the test samples.
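The set perspective can be illustrated with a toy sketch. It is not the network described in this paper: the per-frame encoder (`frame_features`, a random linear map followed by ReLU) and the aggregation (`set_pool`, an element-wise maximum over frames) are assumptions chosen only because a maximum is a simple permutation-invariant operation. The sketch checks two properties claimed above: shuffling the frames leaves the embedding unchanged, and frames from two different videos can be pooled into a single embedding.

```python
import random

def frame_features(frame, weights):
    """Hypothetical per-frame encoder: a linear map followed by ReLU."""
    return [max(0.0, sum(w * x for w, x in zip(row, frame))) for row in weights]

def set_pool(feature_set):
    """Element-wise max over frames; the result does not depend on frame order."""
    return [max(column) for column in zip(*feature_set)]

def embed(frames, weights):
    """Encode every frame independently, then aggregate the set of features."""
    return set_pool([frame_features(frame, weights) for frame in frames])

rng = random.Random(0)
pixels, dims = 12, 4  # toy sizes, assumed for illustration only
weights = [[rng.uniform(-1, 1) for _ in range(pixels)] for _ in range(dims)]
video_a = [[rng.random() for _ in range(pixels)] for _ in range(5)]
video_b = [[rng.random() for _ in range(pixels)] for _ in range(3)]

# Property 1: permuting the frames does not change the embedding.
shuffled_a = video_a[:]
rng.shuffle(shuffled_a)
order_invariant = embed(video_a, weights) == embed(shuffled_a, weights)

# Property 2: frames from two videos form one set; with max pooling, the
# merged embedding equals the element-wise max of the per-video embeddings.
merged = embed(video_a + video_b, weights)
pooled_separately = [
    max(a, b) for a, b in zip(embed(video_a, weights), embed(video_b, weights))
]
merge_consistent = merged == pooled_separately
```

Because each frame is encoded independently and the aggregation ignores order, the same function also accepts sets of any size, which is in keeping with the claim that accuracy can be maintained when only a few test frames are available.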
