Extraction of Discriminative Patterns from Skeleton Sequences for Human Action Recognition

The emergence of new technologies such as the MS Kinect enables reliable extraction of human skeletons from action videos. Taking skeleton data as input, we propose an approach for extracting discriminative patterns for efficient human action recognition. Each action is considered to consist of a series of unit actions, each of which is represented by a pattern. Given a skeleton sequence, we first automatically extract the key-frames of its unit actions and then label them as different patterns. We then use a statistical metric to evaluate the discriminative capability of each pattern, and define the bag of reliable patterns as the local features for action recognition. Experimental results show that the extracted local descriptors provide very high recognition accuracy, which demonstrates the effectiveness of our method in extracting discriminative patterns.
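The pattern-scoring step can be illustrated with a minimal sketch. The abstract does not name the statistical metric, so the sketch assumes one plausible choice: the lower bound of the Wilson score interval on the fraction of a pattern's occurrences that fall in a single action class. It also assumes that key-frame extraction and pattern labelling have already been done, so each sequence arrives as a list of pattern labels. All function names, the threshold, and the toy data are hypothetical.

```python
import math
from collections import Counter, defaultdict


def wilson_lower_bound(hits, total, z=1.96):
    """Lower bound of the Wilson score interval for the proportion hits/total."""
    if total == 0:
        return 0.0
    p = hits / total
    denom = 1 + z * z / total
    centre = p + z * z / (2 * total)
    margin = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total))
    return (centre - margin) / denom


def reliable_patterns(sequences, threshold=0.5):
    """Keep patterns whose dominant action class scores at least `threshold`.

    `sequences` is a list of (pattern_labels, action) pairs (assumed format).
    Returns {pattern: (dominant_action, score)}.
    """
    per_pattern = defaultdict(Counter)
    for labels, action in sequences:
        for pattern in set(labels):  # count each pattern once per sequence
            per_pattern[pattern][action] += 1

    reliable = {}
    for pattern, counts in per_pattern.items():
        action, hits = counts.most_common(1)[0]
        score = wilson_lower_bound(hits, sum(counts.values()))
        if score >= threshold:
            reliable[pattern] = (action, score)
    return reliable


def bag_of_patterns(labels, reliable):
    """Histogram of the reliable patterns that occur in one sequence."""
    return Counter(p for p in labels if p in reliable)


# Toy data: pattern "C" appears equally in both actions, so it is uninformative.
train = [(["A", "C"], "wave")] * 10 + [(["B", "C"], "kick")] * 10
reliable = reliable_patterns(train)
print(sorted(reliable))
print(bag_of_patterns(["A", "C", "A"], reliable))
```

In this toy set, patterns "A" and "B" each occur in only one action and pass the threshold, while "C" is shared by both actions and is discarded. The resulting histogram could then be fed to any standard classifier; the choice of classifier is left open here.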
