Action recognition based on object tracking and dense trajectories

Recognizing human actions in video sequences requires extracting information that adequately represents motion features. In recent years, dense-trajectory-based action recognition algorithms have attracted increasing attention because they capture rich spatio-temporal information. However, these algorithms often struggle with cluttered backgrounds. To address this problem, we incorporate object tracking into the dense-trajectory pipeline: an online weighted multiple instance learning tracker detects the object location, and dense trajectories are computed only within the human bounding box, which effectively suppresses redundant background information. Human actions are then classified with a bag-of-words model and an SVM, both of which are robust to slight drift in the tracking process. Incorporating object tracking improves both the efficiency and the accuracy of dense trajectories. Our algorithm achieves superior results on the KTH and UCF YouTube datasets compared with state-of-the-art methods, notably 89.0% accuracy on the cluttered-background UCF YouTube dataset.
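The core step described above, sampling dense points only inside a tracked bounding box and extending them along the optical flow, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name is hypothetical, and the tracker's box and the per-frame flow fields are assumed to be supplied externally (e.g. by a MIL-style tracker and a Farneback flow estimator).

```python
import numpy as np

def dense_trajectories_in_box(flows, box, step=5, traj_len=15):
    """Sample a dense grid inside a bounding box and extend each point
    into a trajectory by following per-frame optical flow.

    flows : list of (H, W, 2) arrays; flows[t][y, x] = (dx, dy) from frame t to t+1
    box   : (x, y, w, h) bounding box from the tracker in the first frame
    Returns an (N, L+1, 2) array of N trajectories of length L+1 points.
    """
    x0, y0, w, h = box
    xs = np.arange(x0, x0 + w, step, dtype=float)
    ys = np.arange(y0, y0 + h, step, dtype=float)
    # Dense grid restricted to the tracked human bounding box: points
    # outside the box (cluttered background) are never sampled.
    pts = np.stack(np.meshgrid(xs, ys), axis=-1).reshape(-1, 2)  # (N, 2) as (x, y)
    trajs = [pts.copy()]
    for flow in flows[:traj_len]:
        H, W = flow.shape[:2]
        ix = np.clip(np.round(pts[:, 0]).astype(int), 0, W - 1)
        iy = np.clip(np.round(pts[:, 1]).astype(int), 0, H - 1)
        pts = pts + flow[iy, ix]  # displace each point by the flow at its location
        trajs.append(pts.copy())
    return np.stack(trajs, axis=1)
```

In the full pipeline, the resulting trajectory shapes (and descriptors computed along them) would be quantized into a bag-of-words histogram and fed to the SVM; because the histogram aggregates over all sampled points, a slightly drifting box shifts only a small fraction of samples, which is why the representation tolerates minor tracking drift.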
