Combining Orientation Tensors for Human Action Recognition

This paper presents a new tensor motion descriptor based on histograms of oriented gradients (HOG). We model the temporal evolution of the gradient distribution with orientation tensors computed in equally sized blocks throughout the video sequence. These block tensors are then concatenated to create the final descriptor. Using an SVM classifier, and without any bag-of-features approach, our method achieves higher recognition rates than other HOG-based techniques on the KTH dataset and competitive recognition rates on the UCF11 and Hollywood2 datasets.
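The pipeline described above (per-block gradient histograms, accumulated into orientation tensors over time, then concatenated) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the orientation tensor of a histogram vector h is the outer product h h^T, that the tensors of a block are summed over frames and scaled to unit Frobenius norm, and that only the upper triangle of each symmetric tensor is kept. The function names and the input layout `block_histograms[b][t]` are hypothetical, and the histogram extraction itself is taken as given.

```python
import math


def orientation_tensor(histograms):
    """Sum the outer products h h^T over a sequence of histogram vectors.

    Assumption: this is one common way to build an orientation tensor
    from a vector; the abstract does not give the exact formula.
    """
    n = len(histograms[0])
    tensor = [[0.0] * n for _ in range(n)]
    for h in histograms:
        for i in range(n):
            for j in range(n):
                tensor[i][j] += h[i] * h[j]
    return tensor


def normalize(tensor):
    """Scale the tensor to unit Frobenius norm (all-zero tensors are returned as-is)."""
    norm = math.sqrt(sum(v * v for row in tensor for v in row))
    if norm == 0.0:
        return tensor
    return [[v / norm for v in row] for row in tensor]


def upper_triangle(tensor):
    """Flatten a symmetric tensor, keeping each distinct entry once."""
    n = len(tensor)
    return [tensor[i][j] for i in range(n) for j in range(i, n)]


def video_descriptor(block_histograms):
    """Concatenate one normalized tensor per block.

    block_histograms[b][t] is the (hypothetical) gradient histogram
    of block b at frame t.
    """
    descriptor = []
    for histograms in block_histograms:
        tensor = normalize(orientation_tensor(histograms))
        descriptor.extend(upper_triangle(tensor))
    return descriptor


# Toy input: 2 blocks, 2 frames each, 3-bin histograms.
blocks = [
    [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
    [[1.0, 1.0, 0.0], [1.0, 1.0, 0.0]],
]
descriptor = video_descriptor(blocks)
print(len(descriptor))  # 2 blocks * 6 upper-triangle entries = 12
```

With n histogram bins and B blocks, this sketch yields a vector of B * n * (n + 1) / 2 values, which could then be passed to any SVM implementation for classification.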
