Spatial Multi-scale Motion History Histograms and Its Applications

Precisely describing the action in a video is a challenging task because the video content includes various objects whose local motion differs in speed across the video frames. In this paper, a new video feature is proposed that is based on the spatial information of the objects in a frame, together with the motion information between one frame and multiple consecutive frames. The motion information for pixels at the same position throughout the whole video is combined into a new dynamic descriptor, the Spatial Multi-Scale Motion History Histogram (SMMHH). The detailed SMMHH algorithm is given, and it is tested in both human action recognition and touch gesture recognition applications on public video datasets. Experimental results show that it performs well compared with traditional methods.
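The idea of combining per-pixel motion across several temporal scales can be illustrated with a minimal sketch. It is not the SMMHH algorithm itself, which the abstract does not specify: the thresholded frame difference, the run-length binning, the spatial handling (none here), and every name and parameter (`motion_runs`, `multi_scale_motion_histograms`, `steps`, `threshold`, `max_run`) are assumptions made purely for illustration.

```python
def motion_runs(frames, step, threshold, max_run):
    """Histogram of motion run lengths per pixel for one temporal scale.

    frames: list of T grayscale frames, each a list of H rows of W numbers.
    Returns hist[y][x][k - 1] = number of maximal runs of k consecutive
    'moving' samples at pixel (y, x). Runs longer than max_run are put in
    the last bin (an assumption of this sketch, not taken from the paper).
    """
    height, width = len(frames[0]), len(frames[0][0])
    hist = [[[0] * max_run for _ in range(width)] for _ in range(height)]
    for y in range(height):
        for x in range(width):
            run = 0
            for t in range(len(frames) - step):
                # Assumed motion test: thresholded absolute difference
                # between frame t and frame t + step.
                diff = abs(frames[t + step][y][x] - frames[t][y][x])
                if diff > threshold:
                    run += 1
                elif run:
                    hist[y][x][min(run, max_run) - 1] += 1
                    run = 0
            if run:  # close a run that reaches the last frame pair
                hist[y][x][min(run, max_run) - 1] += 1
    return hist


def multi_scale_motion_histograms(frames, steps=(1, 2, 3), threshold=10, max_run=5):
    """Map each temporal step (scale) to its per-pixel run-length histogram."""
    return {step: motion_runs(frames, step, threshold, max_run) for step in steps}
```

Each entry of the returned dictionary holds, for one frame gap, a small histogram per pixel position. A full descriptor would still have to add the spatial component the abstract mentions and flatten the result into a feature vector for a classifier; both steps are left out because the abstract gives no detail on them.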
