Recognizing Actions in 3D Using Action-Snippets and Activated Simplices

Pose-based action recognition in 3D is the task of recognizing an action (e.g., walking or running) from a sequence of 3D skeletal poses. The task is challenging because the same action can be performed in different ways and because the estimated skeletal poses are inaccurate. Training sets are usually small, so complex classifiers risk over-fitting. We address this task using action-snippets, which are short sequences of consecutive skeletal poses that capture the temporal relationships between poses in an action. We propose a novel representation for action-snippets, called activated simplices. Each activity is represented by a manifold, which is approximated by an arrangement of activated simplices. A sequence of action-snippets is classified by selecting the closest manifold and outputting the corresponding activity. The classifier is simple, which helps avoid over-fitting, yet it significantly outperforms state-of-the-art methods on standard benchmarks.
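
The closest-manifold rule described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how simplices are learned, how the distance to a manifold is computed, or how distances are aggregated over a sequence. The sketch assumes that each simplex is given as an array of vertices, that the distance from a snippet to a manifold is the distance to its nearest simplex (treated as the convex hull of its vertices), and that a sequence is scored by summing these distances over its snippets. All function names (`make_snippets`, `distance_to_simplex`, `classify`) are hypothetical.

```python
import numpy as np

def make_snippets(poses, length):
    """Stack `length` consecutive poses (rows of a T x D array) into one vector each."""
    n = poses.shape[0] - length + 1
    return np.stack([poses[t:t + length].ravel() for t in range(n)])

def project_to_probability_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    k = np.arange(1, v.size + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]
    tau = css[rho] / (rho + 1)
    return np.maximum(v - tau, 0.0)

def distance_to_simplex(x, vertices, iters=200):
    """Distance from x to the convex hull of the rows of `vertices`,
    computed by projected gradient descent on the convex weights."""
    n = vertices.shape[0]
    w = np.full(n, 1.0 / n)
    # Step size 1/L, where L is the squared spectral norm of the vertex matrix.
    step = 1.0 / max(np.linalg.norm(vertices, 2) ** 2, 1e-12)
    for _ in range(iters):
        grad = vertices @ (w @ vertices - x)
        w = project_to_probability_simplex(w - step * grad)
    return float(np.linalg.norm(w @ vertices - x))

def classify(snippets, manifolds):
    """Return the label whose manifold is closest to the snippets.

    `manifolds` maps a label to a list of vertex arrays, one per simplex.
    Summing per-snippet distances is an assumption of this sketch.
    """
    scores = {
        label: sum(min(distance_to_simplex(s, V) for V in simplices) for s in snippets)
        for label, simplices in manifolds.items()
    }
    return min(scores, key=scores.get)

# Toy usage: two activities, each approximated by a single 1-simplex (a segment).
manifolds = {
    "walk": [np.array([[0.0, 0.0], [1.0, 0.0]])],
    "run": [np.array([[0.0, 5.0], [1.0, 5.0]])],
}
label = classify(np.array([[0.5, 0.2], [0.8, 0.1]]), manifolds)
```

Because the classifier has no parameters beyond the simplices themselves, there is little to tune at test time, which is in the spirit of the simplicity argument made in the abstract.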
