A Global Spatio-Temporal Representation for Action Recognition

In this paper we introduce an effective method for constructing a global spatio-temporal representation for action recognition. The representation is motivated by the observation that human actions can be treated as 3D shapes induced by silhouettes in the space-time volume. We estimate the silhouettes, which contain detailed shape information about the action, and present an efficient sampling method to extract interest points along them. Each local interest point is represented by a spatio-temporal descriptor based on 2D DAISY. Our global space-time representation integrates these local descriptors in order along the silhouette. In this manner, we exploit not only the static shape information but also the spatio-temporal cues. We have obtained impressive results on publicly available action datasets.
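The idea of sampling interest points along a silhouette and integrating their descriptors in order can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the sampling scheme or the details of the DAISY-based descriptor, so the sketch assumes equal arc-length sampling of a single closed contour and takes the descriptor as a caller-supplied function. The names `sample_along_contour`, `global_representation`, and `describe` are hypothetical.

```python
import numpy as np


def sample_along_contour(contour, n_points):
    """Resample a closed contour at n_points positions equally spaced in arc length.

    contour: (K, 2) array of ordered (x, y) vertices of a silhouette boundary.
    Assumption: equal arc-length spacing; the paper's sampling rule may differ.
    """
    contour = np.asarray(contour, dtype=float)
    closed = np.vstack([contour, contour[:1]])  # repeat first vertex to close the loop
    seg_len = np.linalg.norm(np.diff(closed, axis=0), axis=1)
    cum_len = np.concatenate([[0.0], np.cumsum(seg_len)])
    # Target arc-length positions; endpoint excluded so the start is not duplicated.
    targets = np.linspace(0.0, cum_len[-1], n_points, endpoint=False)
    xs = np.interp(targets, cum_len, closed[:, 0])
    ys = np.interp(targets, cum_len, closed[:, 1])
    return np.stack([xs, ys], axis=1)


def global_representation(points, describe):
    """Concatenate per-point descriptors in contour order into one vector.

    describe: hypothetical callable mapping a point to a 1D descriptor,
    standing in for the spatio-temporal DAISY-based descriptor.
    """
    return np.concatenate([np.asarray(describe(p), dtype=float) for p in points])
```

Because the descriptors are concatenated in contour order, the resulting vector retains where along the silhouette each local measurement was taken, which an unordered bag of descriptors would discard. For a video, the same procedure would be applied per frame, with `describe` also drawing on neighbouring frames.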
