Online Action Recognition based on Skeleton Motion Distribution

Online action recognition, which aims to jointly detect and recognize actions from video streams, is an essential step towards a comprehensive understanding of human behavior. However, accurately locating and recognizing actions in noisy data streams is challenging. This paper proposes a method for effective online action recognition based on the distribution of skeleton motion. Specifically, an adaptive density estimation function is built to calculate the density distribution of skeleton movements. Observing that each action has a unique motion distribution, we detect the occurrence of actions by identifying transitions of the motion distribution in a video stream. Once the starting point of an action is detected, a snippet-based classifier performs online recognition by continuously identifying the most likely action class. Experimental results demonstrate that our method outperforms state-of-the-art methods in both detection accuracy and recognition precision.
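The detection idea described above (estimate a density over recent skeleton motion, then flag a frame whose motion is unlikely under that density) can be illustrated with a minimal sketch. This is not the paper's actual estimator: the per-frame motion feature, the k-nearest-neighbour bandwidth rule, the density-ratio test, and all names and parameters (`motion_magnitudes`, `adaptive_kde`, `detect_action_starts`, `window`, `ratio_threshold`, `min_bandwidth`) are assumptions chosen for illustration.

```python
import numpy as np


def motion_magnitudes(skeletons):
    """Mean joint displacement per frame for a (T, J, 3) array of joint positions.

    Assumed feature: the abstract does not specify how motion is measured.
    """
    diffs = np.diff(np.asarray(skeletons, dtype=float), axis=0)
    return np.linalg.norm(diffs, axis=2).mean(axis=1)


def adaptive_kde(samples, queries, k=5, min_bandwidth=1e-3):
    """Variable-bandwidth Gaussian KDE over 1-D samples.

    Each sample's bandwidth is its distance to its k-th nearest neighbour,
    floored at min_bandwidth (an assumed adaptation rule).
    """
    samples = np.asarray(samples, dtype=float)
    queries = np.atleast_1d(np.asarray(queries, dtype=float))
    pairwise = np.abs(samples[:, None] - samples[None, :])
    kth = min(k, len(samples) - 1)  # column 0 of each sorted row is the sample itself
    bandwidths = np.maximum(np.sort(pairwise, axis=1)[:, kth], min_bandwidth)
    z = (queries[:, None] - samples[None, :]) / bandwidths[None, :]
    kernels = np.exp(-0.5 * z**2) / (bandwidths[None, :] * np.sqrt(2.0 * np.pi))
    return kernels.mean(axis=1)


def detect_action_starts(motion, window=30, ratio_threshold=0.1, k=5):
    """Flag frame t when motion[t] is unlikely under the KDE of the previous window.

    A frame is flagged if its density falls below ratio_threshold times the
    median density of the window's own samples (an assumed transition test).
    """
    motion = np.asarray(motion, dtype=float)
    starts = []
    t = window
    while t < len(motion):
        history = motion[t - window:t]
        reference = np.median(adaptive_kde(history, history, k))
        density = adaptive_kde(history, motion[t], k)[0]
        if density < ratio_threshold * reference:
            starts.append(t)
            t += window  # let the window refill with the new motion regime
        else:
            t += 1
    return starts
```

In a full pipeline, each detected start index would hand off to a classifier that scores short snippets of subsequent frames; that stage is omitted here because the abstract gives no detail on its features or model.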
