Focused Attention for Action Recognition
暂无分享,去创建一个
Cordelia Schmid | Karteek Alahari | Vladyslav Sydorov | C. Schmid | Alahari Karteek | Vladyslav Sydorov
[1] Yunbo Wang,et al. Eidetic 3D LSTM: A Model for Video Prediction and Beyond , 2019, ICLR.
[2] Limin Wang,et al. Temporal Segment Networks for Action Recognition in Videos , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[3] Lorenzo Torresani,et al. Learning Spatiotemporal Features with 3D Convolutional Networks , 2014, 2015 IEEE International Conference on Computer Vision (ICCV).
[4] Bolei Zhou,et al. Moments in Time Dataset: One Million Videos for Event Understanding , 2018, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[5] Thomas Brox,et al. Striving for Simplicity: The All Convolutional Net , 2014, ICLR.
[6] Deva Ramanan,et al. Attentional Pooling for Action Recognition , 2017, NIPS.
[7] Susanne Westphal,et al. The “Something Something” Video Database for Learning and Evaluating Visual Common Sense , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[8] Rob Fergus,et al. Visualizing and Understanding Convolutional Networks , 2013, ECCV.
[9] Kaiming He,et al. Long-Term Feature Banks for Detailed Video Understanding , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[10] Song Han,et al. Temporal Shift Module for Efficient Video Understanding , 2018, ArXiv.
[11] Andrew Zisserman,et al. Spatial Transformer Networks , 2015, NIPS.
[12] Abhinav Gupta,et al. Videos as Space-Time Region Graphs , 2018, ECCV.
[13] Dong Liu,et al. Two-Stream Oriented Video Super-Resolution for Action Recognition , 2019, ArXiv.
[14] Abhishek Das,et al. Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization , 2016, 2017 IEEE International Conference on Computer Vision (ICCV).
[15] Ali Farhadi,et al. Hollywood in Homes: Crowdsourcing Data Collection for Activity Understanding , 2016, ECCV.
[16] Li Fei-Fei,et al. End-to-End Learning of Action Detection from Frame Glimpses in Videos , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[17] Andrew Zisserman,et al. Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.
[18] Bolei Zhou,et al. Temporal Relational Reasoning in Videos , 2017, ECCV.
[19] Cordelia Schmid,et al. Actor-Centric Relation Network , 2018, ECCV.
[20] Bernard Ghanem,et al. ActivityNet: A large-scale video benchmark for human activity understanding , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[21] Cordelia Schmid,et al. PoTion: Pose MoTion Representation for Action Recognition , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[22] Franccois Fleuret,et al. Processing Megapixel Images with Deep Attention-Sampling Models , 2019, ICML.
[23] Cordelia Schmid,et al. Long-Term Temporal Convolutions for Action Recognition , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[24] Andrew Zisserman,et al. Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps , 2013, ICLR.
[25] Andrew Zisserman,et al. Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[26] Oswald Lanz,et al. Attention is All We Need: Nailing Down Object-centric Attention for Egocentric Activity Recognition , 2018, BMVC.
[27] Cees Snoek,et al. VideoLSTM convolves, attends and flows for action recognition , 2016, Comput. Vis. Image Underst..
[28] Pascal Vincent,et al. Visualizing Higher-Layer Features of a Deep Network , 2009 .
[29] Ruslan Salakhutdinov,et al. Action Recognition using Visual Attention , 2015, NIPS 2015.
[30] Abhinav Gupta,et al. Non-local Neural Networks , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[31] Luc Van Gool,et al. Temporal Segment Networks: Towards Good Practices for Deep Action Recognition , 2016, ECCV.
[32] Xiao Liu,et al. Attention Clusters: Purely Attention Based Local Feature Integration for Video Classification , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[33] Bolei Zhou,et al. Learning Deep Features for Discriminative Localization , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[34] Thomas Brox,et al. Chained Multi-stream Networks Exploiting Pose, Motion, and Appearance for Action Classification and Detection , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[35] Matthew J. Hausknecht,et al. Beyond short snippets: Deep networks for video classification , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[36] Andrew Zisserman,et al. Two-Stream Convolutional Networks for Action Recognition in Videos , 2014, NIPS.
[37] Ali Farhadi,et al. Asynchronous Temporal Fields for Action Recognition , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[38] Lukasz Kaiser,et al. Attention is All you Need , 2017, NIPS.
[39] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[40] Cordelia Schmid,et al. Charades-Ego: A Large-Scale Dataset of Paired Third and First Person Videos , 2018, ArXiv.