A study of relative motion point trajectories for action recognition

Trajectories extracted by previous methods for human action recognition contain irrelevant changes, and the Orientation-Magnitude descriptors of their shapes lack robustness to camera motion. To address these problems, this paper proposes action recognition by tracking salient relative motion points. First, a motion boundary detector that suppresses constant camera motion is used to extract motion features. After the detected boundaries are processed with an adaptive threshold, the superpixels that contain salient points are defined as relative motion regions. The points within these superpixels are then tracked to generate trajectories. To describe trajectory shape, pre-defined orientation assignments with coarse-to-fine quantization levels are used to produce orientation statistics. Finally, the oriented gradient, motion boundary, and orientation statistics descriptors, as well as their combination, are each used to represent action videos. Experimental results on the benchmark KTH and UCF-sports action datasets show that the extracted trajectories can describe the movement of the object. Compared with conventional algorithms, our method with multiple kernel learning obtains good performance.
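
As an illustration of the orientation statistics idea, the following minimal sketch quantizes the frame-to-frame displacement directions of one trajectory at several bin counts and concatenates the normalized histograms. The function name `orientation_statistics`, the bin counts (4, 8, 16), the unweighted counting, and the per-level normalization are assumptions made for this example; the abstract does not specify the paper's actual orientation assignments or quantization levels.

```python
import numpy as np

def orientation_statistics(points, levels=(4, 8, 16)):
    """Concatenate orientation histograms of one trajectory at several bin counts.

    points: (T, 2) sequence of tracked (x, y) positions.
    levels: bin counts from coarse to fine (hypothetical values, not from the paper).
    """
    pts = np.asarray(points, dtype=float)
    disp = np.diff(pts, axis=0)                  # frame-to-frame displacement vectors
    mag = np.hypot(disp[:, 0], disp[:, 1])
    moving = mag > 1e-6                          # skip steps with no motion
    # Direction of each step in [0, 2*pi); magnitude is deliberately ignored.
    angles = np.mod(np.arctan2(disp[moving, 1], disp[moving, 0]), 2 * np.pi)

    parts = []
    for bins in levels:
        # Map each angle to a bin index; the modulo guards the 2*pi edge case.
        idx = np.floor(angles / (2 * np.pi) * bins).astype(int) % bins
        hist = np.bincount(idx, minlength=bins).astype(float)
        total = hist.sum()
        parts.append(hist / total if total > 0 else hist)
    return np.concatenate(parts)

# Usage: a point moving steadily to the right falls into bin 0 at every level.
descriptor = orientation_statistics([(0, 0), (1, 0), (2, 0), (3, 0)])
```

Because only the direction of each step is counted, the descriptor in this sketch does not change when the whole trajectory is scaled, which is one plausible reading of why orientation statistics would be less sensitive to displacement magnitude than an Orientation-Magnitude descriptor.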
