Paper information - Video Segmentation Descriptors for Event Recognition

Video Segmentation Descriptors for Event Recognition

This paper presents a new video motion descriptor based on multi-scale video segmentation, which provides a multi-layered output and connects to the rich interactions that occur between objects at the semantic level. We also emphasize relationships between motion clusters through a new relative motion descriptor that encapsulates relative motion patterns within a local spatio-temporal neighborhood. Experimental results on the challenging TRECVID MED11 event recognition dataset validate the approach.

Ramakant Nevatia | Rémi Trichet
