Local Surface Geometric Feature for 3D Human Action Recognition

This paper presents a novel Local Surface Geometric Feature (LSGF) for human action recognition from video sequences captured by a depth camera. The LSGF is extracted around each skeleton joint in point-cloud space and captures static appearance and pose cues, comprising the joint position, surface normal, and local curvature. A temporal pyramid of covariance matrices is then used to model the pairwise relations between feature dimensions, rather than the features themselves, together with their temporal evolution. Finally, Fisher vector encoding is adopted to produce a global representation of each video sequence, and an SVM classifier is used for classification. In extensive experiments, we achieve classification results superior to most previously published results on three public benchmark datasets, i.e., MSR-Action3D, MSR DailyActivity3D, and UTKinect-Action.

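The abstract describes a pipeline of per-joint geometric features, a temporal pyramid of covariance descriptors, Fisher vector encoding, and SVM classification. Below is a minimal sketch of the covariance/temporal-pyramid stage only, assuming per-frame, per-joint LSGF vectors (position, normal, curvature) are already available as an array of shape (T, J, D); the function names, the three-level pyramid, and the log-Euclidean regularization are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (not the authors' code) of a covariance-based temporal pyramid.
# Assumes `features` has shape (T, J, D): T frames, J skeleton joints, and a
# D-dimensional LSGF per joint (e.g., 3-D position + 3-D normal + curvature, D = 7).
import numpy as np
from scipy.linalg import logm  # matrix logarithm for the log-Euclidean mapping


def covariance_descriptor(frames):
    """Covariance of per-joint features over a block of frames.

    `frames` has shape (t, J, D); the descriptor is the D x D covariance of the
    pooled (t * J, D) samples, mapped through the matrix logarithm so that
    descriptors can be compared in a vector space (log-Euclidean metric).
    """
    samples = frames.reshape(-1, frames.shape[-1])                    # (t * J, D)
    cov = np.cov(samples, rowvar=False) + 1e-6 * np.eye(samples.shape[1])
    log_cov = logm(cov).real                                          # symmetric D x D
    iu = np.triu_indices(log_cov.shape[0])
    return log_cov[iu]                                                # upper triangle as a vector


def temporal_pyramid(features, levels=3):
    """Concatenate covariance descriptors over a temporal pyramid.

    Level l splits the sequence into 2**l equal segments; one covariance
    descriptor is computed per segment and all descriptors are concatenated.
    """
    T = features.shape[0]
    descs = []
    for level in range(levels):
        for seg in np.array_split(np.arange(T), 2 ** level):
            descs.append(covariance_descriptor(features[seg]))
    return np.concatenate(descs)


# Example: a random 60-frame sequence with 20 joints and a 7-D LSGF per joint.
video_descriptor = temporal_pyramid(np.random.randn(60, 20, 7))
print(video_descriptor.shape)
```

In the full pipeline sketched in the abstract, such per-sequence descriptors would then be Fisher-vector encoded and classified with a linear SVM (e.g., LIBLINEAR).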
[1]  Fatih Murat Porikli,et al.  Region Covariance: A Fast Descriptor for Detection and Classification , 2006, ECCV.

[2]  Nicu Sebe,et al.  GLocal tells you more: Coupling GLocal structural for feature selection with sparsity for image and video classification , 2014, Comput. Vis. Image Underst..

[3]  Subramanian Ramanathan,et al.  Multitask Linear Discriminant Analysis for View Invariant Action Recognition , 2014, IEEE Transactions on Image Processing.

[4]  Thomas Mensink,et al.  Improving the Fisher Kernel for Large-Scale Image Classification , 2010, ECCV.

[5]  Mario Fernando Montenegro Campos,et al.  STOP: Space-Time Occupancy Patterns for 3D Action Recognition from Depth Map Sequences , 2012, CIARP.

[6]  Ying Wu,et al.  Robust 3D Action Recognition with Random Occupancy Patterns , 2012, ECCV.

[7]  Gang Wang,et al.  Discriminative multi-manifold analysis for face recognition from a single training sample per person , 2011, 2011 International Conference on Computer Vision.

[8]  Gang Wang,et al.  Human Identity and Gender Recognition From Gait Sequences With Arbitrary Walking Directions , 2014, IEEE Transactions on Information Forensics and Security.

[9]  Zicheng Liu,et al.  HON4D: Histogram of Oriented 4D Normals for Activity Recognition from Depth Sequences , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[10]  Alexander M. Bronstein,et al.  Numerical Geometry of Non-Rigid Shapes , 2009, Monographs in Computer Science.

[11]  Jiwen Lu,et al.  Cost-Sensitive Subspace Analysis and Extensions for Face Recognition , 2013, IEEE Transactions on Information Forensics and Security.

[12]  Guangfeng Lin,et al.  Feature structure fusion modelling for classification , 2015, IET Image Process..

[13]  G. Johansson Visual perception of biological motion and a model for its analysis , 1973 .

[14]  Toby Sharp,et al.  Real-time human pose recognition in parts from single depth images , 2011, CVPR.

[15]  Guangfeng Lin,et al.  Feature structure fusion and its application , 2014, Inf. Fusion.

[16]  Ying Wu,et al.  Mining actionlet ensemble for action recognition with depth cameras , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[17]  Janusz Konrad,et al.  Action Recognition Using Sparse Representation on Covariance Manifolds of Optical Flow , 2010, 2010 7th IEEE International Conference on Advanced Video and Signal Based Surveillance.

[18]  Anuj Srivastava,et al.  Action Recognition Using Rate-Invariant Analysis of Skeletal Shape Trajectories , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[19]  Subramanian Ramanathan,et al.  No Matter Where You Are: Flexible Graph-Guided Multi-task Learning for Multi-view Head Pose Classification under Target Motion , 2013, 2013 IEEE International Conference on Computer Vision.

[20]  Rémi Ronfard,et al.  A survey of vision-based methods for action representation, segmentation and recognition , 2011, Comput. Vis. Image Underst..

[21]  Marwan Torki,et al.  Human Action Recognition Using a Temporal Hierarchy of Covariance Descriptors on 3D Joint Locations , 2013, IJCAI.

[22]  Nicu Sebe,et al.  Egocentric Daily Activity Recognition via Multitask Clustering , 2015, IEEE Transactions on Image Processing.

[23]  Jiwen Lu,et al.  Regularized Locality Preserving Projections and Its Extensions for Face Recognition , 2010, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics).

[24]  Wei-Yun Yau,et al.  Human Action Recognition With Video Data: Research and Evaluation Challenges , 2014, IEEE Transactions on Human-Machine Systems.

[25]  Arif Mahmood,et al.  HOPC: Histogram of Oriented Principal Components of 3D Pointclouds for Action Recognition , 2014, ECCV.

[26]  N. Ayache,et al.  Log‐Euclidean metrics for fast and simple calculus on diffusion tensors , 2006, Magnetic resonance in medicine.

[27]  Gang Wang,et al.  Reconstruction-Based Metric Learning for Unconstrained Face Verification , 2015, IEEE Transactions on Information Forensics and Security.

[28]  Shuicheng Yan,et al.  Body Surface Context: A New Robust Feature for Action Recognition From Depth Videos , 2014, IEEE Transactions on Circuits and Systems for Video Technology.

[29]  Jake K. Aggarwal,et al.  View invariant human action recognition using histograms of 3D joints , 2012, 2012 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops.

[30]  Chih-Jen Lin,et al.  LIBLINEAR: A Library for Large Linear Classification , 2008, J. Mach. Learn. Res..

[31]  Nasser Kehtarnavaz,et al.  Action Recognition from Depth Sequences Using Depth Motion Maps-Based Local Binary Patterns , 2015, 2015 IEEE Winter Conference on Applications of Computer Vision.

[32]  Xiaodong Yang,et al.  Recognizing actions using depth motion maps-based histograms of oriented gradients , 2012, ACM Multimedia.

[33]  Nasser Kehtarnavaz,et al.  Real-time human action recognition based on depth motion maps , 2013, Journal of Real-Time Image Processing.

[34]  Rama Chellappa,et al.  Human Action Recognition by Representing 3D Skeletons as Points in a Lie Group , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[35]  Gang Wang,et al.  Image-to-Set Face Recognition Using Locality Repulsion Projections and Sparse Reconstruction-Based Similarity Measure , 2013, IEEE Transactions on Circuits and Systems for Video Technology.

[36]  Junsong Yuan,et al.  Learning Actionlet Ensemble for 3D Human Action Recognition , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[37]  Xiaodong Yang,et al.  Super Normal Vector for Activity Recognition Using Depth Sequences , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[38]  Subramanian Ramanathan,et al.  Clustered Multi-task Linear Discriminant Analysis for View Invariant Color-Depth Action Recognition , 2014, 2014 22nd International Conference on Pattern Recognition.

[39]  Brian C. Lovell,et al.  Spatio-temporal covariance descriptors for action and gesture recognition , 2013, 2013 IEEE Workshop on Applications of Computer Vision (WACV).

[40]  Xiaodong Yang,et al.  EigenJoints-based action recognition using Naïve-Bayes-Nearest-Neighbor , 2012, 2012 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops.

[41]  Jiwen Lu,et al.  Neighborhood repulsed metric learning for kinship verification , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[42]  Nicu Sebe,et al.  Event Oriented Dictionary Learning for Complex Event Detection , 2015, IEEE Transactions on Image Processing.

[43]  Wanqing Li,et al.  Action recognition based on a bag of 3D points , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition - Workshops.

[44]  Jiwen Lu,et al.  Learning Compact Binary Face Descriptor for Face Recognition , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[45]  Alexandros André Chaaraoui,et al.  Evolutionary joint selection to improve human action recognition with RGB-D devices , 2014, Expert Syst. Appl..

[46]  Jake K. Aggarwal,et al.  Spatio-temporal Depth Cuboid Similarity Feature for Activity Recognition Using Depth Camera , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.