Spatiotemporal representation of 3D skeleton joints-based action recognition using modified spherical harmonics

Action recognition based on the 3D coordinates of body skeleton joints is an important topic in computer vision applications and human-robot interaction. At present, most 3D data are captured using recently introduced, economical depth sensors. In this study, we explore a new method for skeleton-based human action recognition. In the proposed framework, the normalized angles of local joints are first extracted, and modified spherical harmonics (MSHs) are then used to explicitly model the angular skeleton by projecting the spherical angles onto the unit sphere basis. This process decomposes the skeleton representation into a set of basis functions. A spatiotemporal system of the spherical angles is adopted to capture the static pose and the joint displacement over a human action sequence. Consequently, the MSH coefficients of the joints are used as the discriminative descriptor of the sequence. An extreme learning machine (ELM) classifier and recently published 3D action datasets are used to validate the proposed method. The experimental results show that the proposed approach performs better than many classical methods.
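The angle-extraction and basis-projection steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: it uses standard real spherical harmonics (one common convention, with the Condon-Shortley phase) rather than the modified spherical harmonics of the paper, whose definition is not given in the abstract. The function names (`spherical_angles`, `real_sph_harm`, `pose_descriptor`), the choice of reference joint, and the flat per-joint descriptor layout are all assumptions made for illustration.

```python
import math

def spherical_angles(joint, origin):
    """Return (theta, phi) of a joint relative to a reference joint.

    theta is the polar angle from the +z axis, phi the azimuth in the x-y plane.
    The choice of reference joint (e.g. a hip center) is an assumption here.
    """
    dx, dy, dz = (j - o for j, o in zip(joint, origin))
    r = math.sqrt(dx * dx + dy * dy + dz * dz)
    if r == 0.0:
        return 0.0, 0.0  # degenerate case: joint coincides with the reference
    theta = math.acos(max(-1.0, min(1.0, dz / r)))
    phi = math.atan2(dy, dx)
    return theta, phi

def assoc_legendre(l, m, x):
    """Associated Legendre function P_l^m(x) for 0 <= m <= l (standard recurrence)."""
    pmm = 1.0
    if m > 0:
        s = math.sqrt(max(0.0, 1.0 - x * x))
        fact = 1.0
        for _ in range(m):
            pmm *= -fact * s  # includes the Condon-Shortley phase
            fact += 2.0
    if l == m:
        return pmm
    pmmp1 = x * (2 * m + 1) * pmm
    for ll in range(m + 2, l + 1):
        pmm, pmmp1 = pmmp1, ((2 * ll - 1) * x * pmmp1 - (ll + m - 1) * pmm) / (ll - m)
    return pmmp1

def real_sph_harm(l, m, theta, phi):
    """Real spherical harmonic of degree l and order m (-l <= m <= l)."""
    am = abs(m)
    norm = math.sqrt((2 * l + 1) / (4 * math.pi)
                     * math.factorial(l - am) / math.factorial(l + am))
    p = assoc_legendre(l, am, math.cos(theta))
    if m == 0:
        return norm * p
    if m > 0:
        return math.sqrt(2.0) * norm * p * math.cos(m * phi)
    return math.sqrt(2.0) * norm * p * math.sin(am * phi)

def pose_descriptor(joints, origin, max_degree):
    """Evaluate all harmonics up to max_degree at each joint's angles.

    Returns a flat list with (max_degree + 1)**2 values per joint.
    This layout is a hypothetical stand-in for the paper's MSH coefficients.
    """
    desc = []
    for joint in joints:
        theta, phi = spherical_angles(joint, origin)
        for l in range(max_degree + 1):
            for m in range(-l, l + 1):
                desc.append(real_sph_harm(l, m, theta, phi))
    return desc
```

Applying `pose_descriptor` to every frame of a sequence yields one vector per static pose; differences between consecutive frames would be one simple way to encode joint displacement, though the abstract does not specify how the paper combines the two.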
