Paper information - Spatio-temporal information for human action recognition

Spatio-temporal information for human action recognition

Human activity recognition in videos is important for content-based video indexing, intelligent monitoring, human-machine interaction, and virtual reality. This paper adopts the low-level feature-based framework for human activity recognition, which consists of four steps: feature extraction and descriptor computation, early multi-feature fusion, video representation, and classification. This paper improves the first two steps. We propose a spatio-temporal bigraph-based multi-feature fusion algorithm that captures the visual information useful for recognition. We also introduce a compressed spatio-temporal video representation into the bag-of-words representation. Experiments on two popular datasets show that the method performs efficiently.
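The low-level feature-based framework named in the abstract can be illustrated with a small sketch. It is not the authors' method. Plain descriptor concatenation stands in for the bigraph-based fusion, the codebook is fixed by hand instead of being learned, and the classification step is omitted. All function names and toy values are hypothetical.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def early_fusion(descriptors_a, descriptors_b):
    """Concatenate two descriptors computed at the same local features.

    Assumption: simple concatenation, used here only as a stand-in
    for the paper's bigraph-based fusion.
    """
    return [a + b for a, b in zip(descriptors_a, descriptors_b)]


def nearest_codeword(descriptor, codebook):
    """Return the index of the codeword closest to the descriptor."""
    return min(range(len(codebook)), key=lambda k: dist(descriptor, codebook[k]))


def bag_of_words(descriptors, codebook):
    """Build an L1-normalised histogram of codeword assignments for one video."""
    histogram = [0.0] * len(codebook)
    for descriptor in descriptors:
        histogram[nearest_codeword(descriptor, codebook)] += 1.0
    total = sum(histogram)
    return [h / total for h in histogram] if total else histogram


# Toy data (hypothetical): three local features, each with two descriptor types.
shape_descriptors = [[0.0, 0.1], [0.9, 1.0], [1.0, 0.9]]
motion_descriptors = [[0.0], [1.0], [1.0]]

# Hand-picked codebook with two codewords in the fused 3-D descriptor space.
codebook = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]

fused = early_fusion(shape_descriptors, motion_descriptors)
video_histogram = bag_of_words(fused, codebook)
```

In a full pipeline the codebook would typically be learned by clustering training descriptors, and the resulting histogram would be passed to a classifier such as an SVM.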

Li Yao | Yunjian Liu | Shihui Huang
