Human action recognition with skeletal information from depth camera

We propose a human action recognition method based on the human's skeletal information. The angular representation of the skeleton is invariant to the scale of the actor and to the actor's orientation relative to the camera, while preserving the correlation among different body parts. A modified Dynamic Time Warping (DTW) algorithm is applied as a template-matching classifier for the actions. We collect our data with the Xbox Kinect platform and recognize Taiji, a well-known traditional Chinese shadow boxing, by action type, achieving an accuracy of 80%.
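The two ideas in the abstract, an angular skeleton representation and DTW template matching, can be illustrated with a minimal sketch. The abstract does not specify the paper's exact angle definition or its DTW modification, so this example assumes a simple three-point joint angle and the classic, unmodified DTW recurrence. The function names `joint_angle`, `dtw_distance`, and `classify` are hypothetical and chosen only for illustration.

```python
import math


def joint_angle(a, b, c):
    """Angle in radians at joint b formed by the 3D points a-b-c.

    Assumption: this three-point angle stands in for the paper's angular
    representation, which the abstract does not define. Both limbs must
    have non-zero length.
    """
    u = [a[i] - b[i] for i in range(3)]
    v = [c[i] - b[i] for i in range(3)]
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, dot / norm)))


def dtw_distance(seq_a, seq_b):
    """Classic DTW cost between two sequences of angle vectors.

    Assumption: this is the textbook recurrence, not the modified DTW
    mentioned in the abstract.
    """
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(seq_a[i - 1], seq_b[j - 1])  # per-frame distance
            cost[i][j] = d + min(cost[i - 1][j],       # skip a frame of seq_a
                                 cost[i][j - 1],       # skip a frame of seq_b
                                 cost[i - 1][j - 1])   # match both frames
    return cost[n][m]


def classify(query, templates):
    """Return the label of the template with the smallest DTW cost.

    templates maps a label to one reference sequence of angle vectors.
    """
    return min(templates, key=lambda label: dtw_distance(query, templates[label]))
```

Because the angle depends only on the directions of the two limb vectors, scaling the skeleton or rotating it rigidly leaves the feature unchanged, which is the invariance the abstract refers to. DTW then lets a query performed faster or slower than the template still align with it.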
