Action recognition based on sparse motion trajectories

We present a method that extracts effective features in videos for human action recognition. The proposed method analyses the 3D volumes along the sparse motion trajectories of a set of interest points from the video scene. To represent human actions, we generate a Bag-of-Features (BoF) model based on extracted features, and finally a support vector machine is used to classify human activities. Evaluation shows that the proposed features are discriminative and computationally efficient. Our method achieves state-of-the-art performance with the standard human action recognition benchmarks, namely KTH and Weizmann datasets.

[1]  Noel E. O'Connor,et al.  Team Activity Recognition in Sports , 2012, ECCV.

[2]  Quoc V. Le,et al.  Learning hierarchical invariant spatio-temporal features for action recognition with independent subspace analysis , 2011, CVPR 2011.

[3]  Noel E. O'Connor,et al.  Automatic camera selection for activity monitoring in a multi-camera system for tennis , 2009, 2009 Third ACM/IEEE International Conference on Distributed Smart Cameras (ICDSC).

[4]  Georges Quénot,et al.  TRECVID 2015 - An Overview of the Goals, Tasks, Data, Evaluation Mechanisms and Metrics , 2011, TRECVID.

[5]  Cordelia Schmid,et al.  Learning realistic human actions from movies , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[6]  Barbara Caputo,et al.  Recognizing human actions: a local SVM approach , 2004, Proceedings of the 17th International Conference on Pattern Recognition, 2004. ICPR 2004..

[7]  Shaogang Gong,et al.  Recognising action as clouds of space-time interest points , 2009, CVPR.

[8]  Andrew Gilbert,et al.  Action Recognition Using Mined Hierarchical Compound Features , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[9]  Mubarak Shah,et al.  Recognizing human actions in videos acquired by uncalibrated moving cameras , 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.

[10]  Mubarak Shah,et al.  View-invariance in action recognition , 2001, Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. CVPR 2001.

[11]  Alan F. Smeaton,et al.  Multispectral Object Segmentation and Retrieval in Surveillance Video , 2006, 2006 International Conference on Image Processing.

[12]  Sharath Pankanti,et al.  CMU-IBM-NUS@TRECVID 2012: Surveillance Event Detection , 2012 .

[13]  Changyin Sun,et al.  Supervised class-specific dictionary learning for sparse modeling in action recognition , 2012, Pattern Recognit..

[14]  Adriana Kovashka,et al.  Learning a hierarchy of discriminative space-time neighborhood features for human action recognition , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[15]  Bill Triggs,et al.  Histograms of oriented gradients for human detection , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[16]  Cordelia Schmid,et al.  Action recognition by dense trajectories , 2011, CVPR 2011.

[17]  Mubarak Shah,et al.  Human Action Recognition in Videos Using Kinematic Features and Multiple Instance Learning , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[18]  Yonghong Tian,et al.  PKU-NEC @TRECVID2012 SED : Uneven-Sequence Based Event Detection in Surveillance Video , 2012, TRECVID.

[19]  D. Sculley,et al.  Web-scale k-means clustering , 2010, WWW '10.

[20]  Liangliang Cao,et al.  MediaCCNY at TRECVID 2012: Surveillance Event Detection , 2012, TRECVID.

[21]  Aaron F. Bobick,et al.  Recognition of human body motion using phase space constraints , 1995, Proceedings of IEEE International Conference on Computer Vision.

[22]  Ronen Basri,et al.  Actions as Space-Time Shapes , 2007, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[23]  Yonghong Tian,et al.  PKU-NEC @TRECVID2011 SED: Sequence-Based Event Detection in Surveillance Video , 2011, TRECVID.

[24]  Peyman Milanfar,et al.  Action Recognition from One Example , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[25]  Cordelia Schmid,et al.  Human Detection Using Oriented Histograms of Flow and Appearance , 2006, ECCV.

[26]  Yaser Sheikh,et al.  Exploring the space of a human action , 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.

[27]  Greg Mori,et al.  Action recognition by learning mid-level motion features , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.