Combined shape analysis of human poses and motion units for action segmentation and recognition

Recognizing human actions or analyzing human behaviors from 3D videos is an important problem currently investigated in many research domains. The high complexity of human motions and the variability of gesture combinations make this task challenging. Temporally local analysis of a sequence is often necessary for a more accurate and thorough understanding of what the human is doing. In this paper, we propose a method that combines pose-based and segment-based approaches to segment an action sequence into motion units (MUs). We jointly analyze the shape of the human pose and the shape of its motion using a shape analysis framework that represents and compares shapes on a Riemannian manifold. On the one hand, this allows us to detect periodic MUs and thus perform action segmentation. On the other hand, it allows us to remove repeated gestures in order to handle failure cases in action recognition. Experiments are performed on three representative datasets for the tasks of action segmentation and action recognition. The results are competitive with state-of-the-art methods on both tasks.
