Real-Time Multi-scale Action Detection from 3D Skeleton Data

In this paper, we introduce a real-time system for action detection. The system uses a small set of robust features extracted from 3D skeleton data. The features are described in terms of the probability distribution of the skeleton data: the descriptor computes a pyramid of sample covariance matrices and mean vectors to encode the relationships between the features. To handle intra-class variations, such as differences in the temporal scale of an action, the descriptor is computed at several window scales for each action. Discriminative elements of the descriptor are mined using feature selection. The system achieves accurate detection results on difficult unsegmented sequences. Experiments on the MSRC-12 and G3D datasets show that the proposed system outperforms the state of the art in detection accuracy with very low latency. To the best of our knowledge, we are the first to propose multi-scale description for action detection from 3D skeleton data.
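
The multi-scale descriptor outlined above can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the two-level pyramid, the window scales (4 and 8 frames), and the per-frame feature layout are all assumptions made for illustration. Each temporal segment is summarized by its mean vector and the upper triangle of its sample covariance matrix, and the summaries are concatenated over the pyramid levels.

```python
from typing import Dict, List, Sequence

# Assumption: each frame is a flat feature vector derived from the skeleton.
Frame = Sequence[float]


def mean_and_covariance(frames: Sequence[Frame]) -> List[float]:
    """Return the mean vector followed by the upper triangle of the sample covariance."""
    n = len(frames)
    d = len(frames[0])
    mean = [sum(f[i] for f in frames) / n for i in range(d)]
    denom = max(n - 1, 1)  # unbiased estimate; guards single-frame segments
    upper = []
    for i in range(d):
        for j in range(i, d):
            s = sum((f[i] - mean[i]) * (f[j] - mean[j]) for f in frames)
            upper.append(s / denom)
    return mean + upper


def pyramid_descriptor(frames: Sequence[Frame], levels: int = 2) -> List[float]:
    """Concatenate mean/covariance summaries over 1, 2, 4, ... equal temporal segments."""
    n = len(frames)
    if n < 2 ** (levels - 1):
        raise ValueError("window too short for the requested pyramid depth")
    out: List[float] = []
    for level in range(levels):
        parts = 2 ** level
        for p in range(parts):
            start = p * n // parts
            end = (p + 1) * n // parts
            out.extend(mean_and_covariance(frames[start:end]))
    return out


def multiscale_descriptors(
    stream: Sequence[Frame], scales: Sequence[int] = (4, 8)
) -> Dict[int, List[float]]:
    """Describe the most recent frames of the stream once per window scale."""
    return {
        s: pyramid_descriptor(stream[-s:])
        for s in scales
        if len(stream) >= s
    }


if __name__ == "__main__":
    # Toy stream: 8 frames with 2 hypothetical features each.
    stream = [[float(t), float(2 * t)] for t in range(8)]
    for scale, desc in multiscale_descriptors(stream).items():
        print(scale, len(desc))
```

In an online setting, one descriptor per scale would be recomputed as each new frame arrives and passed to a classifier. The choice of classifier and of the feature selection step is left open here, since the abstract does not specify them.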
