A joint model for action localization and classification in untrimmed video with visual attention
暂无分享,去创建一个
Wenmin Wang | Ge Li | Jinzhuo Wang | Xiongtao Chen | Weimian Li | Ge Li | Wenmin Wang | Jinzhuo Wang | Xiongtao Chen | Weimian Li
[1] Lorenzo Torresani,et al. Learning Spatiotemporal Features with 3D Convolutional Networks , 2014, 2015 IEEE International Conference on Computer Vision (ICCV).
[2] Ming Yang,et al. 3D Convolutional Neural Networks for Human Action Recognition , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[3] Xiaoou Tang,et al. Action Recognition and Detection by Combining Motion and Appearance Features , 2014 .
[4] Limin Wang,et al. Action recognition with trajectory-pooled deep-convolutional descriptors , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[5] R. J. Williams,et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning , 2004, Machine Learning.
[6] Cordelia Schmid,et al. Learning realistic human actions from movies , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.
[7] Koray Kavukcuoglu,et al. Visual Attention , 2020, Computational Models for Cognitive Vision.
[8] Ivan Laptev,et al. On Space-Time Interest Points , 2005, International Journal of Computer Vision.
[9] Yoshua Bengio,et al. Show, Attend and Tell: Neural Image Caption Generation with Visual Attention , 2015, ICML.
[10] Ruslan Salakhutdinov,et al. Action Recognition using Visual Attention , 2015, NIPS 2015.
[11] Luc Van Gool,et al. An Efficient Dense and Scale-Invariant Spatio-Temporal Interest Point Detector , 2008, ECCV.
[12] Bernard Ghanem,et al. ActivityNet: A large-scale video benchmark for human activity understanding , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[13] Alex Graves,et al. Recurrent Models of Visual Attention , 2014, NIPS.
[14] Cordelia Schmid,et al. Evaluation of Local Spatio-temporal Features for Action Recognition , 2009, BMVC.
[15] Cordelia Schmid,et al. A Spatio-Temporal Descriptor Based on 3D-Gradients , 2008, BMVC.
[16] Andrew Zisserman,et al. Two-Stream Convolutional Networks for Action Recognition in Videos , 2014, NIPS.
[17] Serge J. Belongie,et al. Behavior recognition via sparse spatio-temporal features , 2005, 2005 IEEE International Workshop on Visual Surveillance and Performance Evaluation of Tracking and Surveillance.
[18] Li Fei-Fei,et al. End-to-End Learning of Action Detection from Frame Glimpses in Videos , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[19] Pierre Sermanet,et al. Attention for Fine-Grained Categorization , 2014, ICLR.
[20] Wen Gao,et al. Deep Alternative Neural Network: Exploring Contexts as Early as Possible for Action Recognition , 2016, NIPS.
[21] Fabio Cuzzolin,et al. Untrimmed Video Classification for Activity Detection: submission to ActivityNet Challenge , 2016, ArXiv.