No frame left behind: Full Video Action Recognition
Olaf Booij | Silvia L. Pintea | Jan C. van Gemert | Fatemeh Karimi Nejadasl | Xin Liu
[1] Sergey Ioffe,et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift , 2015, ICML.
[2] Yue Wang,et al. E2-Train: Training State-of-the-art CNNs with Over 80% Energy Savings , 2019, NeurIPS.
[3] Andrew Zisserman,et al. Two-Stream Convolutional Networks for Action Recognition in Videos , 2014, NIPS.
[4] Wei Wu,et al. STM: SpatioTemporal and Motion Encoding for Action Recognition , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[5] Guo-Jun Qi,et al. Differential Recurrent Neural Networks for Action Recognition , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[6] Arnold W. M. Smeulders,et al. Timeception for Complex Action Recognition , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[7] Fabio Viola,et al. The Kinetics Human Action Video Dataset , 2017, ArXiv.
[8] Andrew Zisserman,et al. Convolutional Two-Stream Network Fusion for Video Action Recognition , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[9] Thomas Serre,et al. HMDB: A large video database for human motion recognition , 2011, 2011 International Conference on Computer Vision.
[10] Ming Yang,et al. 3D Convolutional Neural Networks for Human Action Recognition , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[11] Ryan P. Adams,et al. Randomized Automatic Differentiation , 2020, ICLR.
[12] Thomas Brox,et al. ECO: Efficient Convolutional Network for Online Video Understanding , 2018, ECCV.
[13] Lorenzo Torresani,et al. SCSampler: Sampling Salient Clips From Video for Efficient Action Recognition , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[14] Luc Van Gool,et al. Temporal Segment Networks: Towards Good Practices for Deep Action Recognition , 2016, ECCV.
[15] Andrea Vedaldi,et al. Action Recognition with Dynamic Image Networks , 2022, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[16] Cees Snoek,et al. What do 15,000 object categories tell us about classifying and localizing actions? , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[17] Li Fei-Fei,et al. ImageNet: A large-scale hierarchical image database , 2009, CVPR.
[18] Chuang Gan,et al. TSM: Temporal Shift Module for Efficient Video Understanding , 2018, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[19] Tianqi Chen,et al. Training Deep Nets with Sublinear Memory Cost , 2016, ArXiv.
[20] Ming Shao,et al. A Multi-stream Bi-directional Recurrent Neural Network for Fine-Grained Action Detection , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[21] Yutaka Satoh,et al. Can Spatiotemporal 3D CNNs Retrace the History of 2D CNNs and ImageNet? , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[22] Susanne Westphal,et al. The “Something Something” Video Database for Learning and Evaluating Visual Common Sense , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[23] Viorica Patraucean,et al. Sideways: Depth-Parallel Training of Video Models , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[24] Sridha Sridharan,et al. Two Stream LSTM: A Deep Fusion Framework for Human Action Recognition , 2017, 2017 IEEE Winter Conference on Applications of Computer Vision (WACV).
[25] Thomas Serre,et al. The Language of Actions: Recovering the Syntax and Semantics of Goal-Directed Human Activities , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[26] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[27] Lorenzo Torresani,et al. C3D: Generic Features for Video Analysis , 2014, ArXiv.
[28] Fadime Sener,et al. Temporal Aggregate Representations for Long-Range Video Understanding , 2020, ECCV.
[29] Sergio Escalera,et al. Gate-Shift Networks for Video Action Recognition , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[30] Luc Van Gool,et al. Action snippets: How many frames does human action recognition require? , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.
[31] Li Fei-Fei,et al. End-to-End Learning of Action Detection from Frame Glimpses in Videos , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[32] Larry S. Davis,et al. A Coarse-to-Fine Framework for Resource Efficient Video Recognition , 2019, International Journal of Computer Vision.
[33] Suha Kwak,et al. MotionSqueeze: Neural Motion Feature Learning for Video Understanding , 2020, ECCV.
[34] Wenhao Wu,et al. Multi-Agent Reinforcement Learning Based Frame Sampling for Effective Untrimmed Video Recognition , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[35] Chen Sun,et al. Rethinking Spatiotemporal Feature Learning: Speed-Accuracy Trade-offs in Video Classification , 2017, ECCV.
[36] Abhinav Gupta,et al. Videos as Space-Time Region Graphs , 2018, ECCV.
[37] Raquel Urtasun,et al. The Reversible Residual Network: Backpropagation Without Storing Activations , 2017, NIPS.
[38] Davide Modolo,et al. Action Recognition With Spatial-Temporal Discriminative Filter Banks , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[39] Ngai-Man Cheung,et al. Deep Adaptive Temporal Pooling for Activity Recognition , 2018, ACM Multimedia.
[40] Wojciech Samek,et al. Dithered backprop: A sparse and quantized backpropagation algorithm for more efficient deep neural network training , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
[41] Fei-Fei Li,et al. Large-Scale Video Classification with Convolutional Neural Networks , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[42] Eldad Haber,et al. Reversible Architectures for Arbitrarily Deep Residual Neural Networks , 2017, AAAI.
[43] Kaiming He,et al. A Multigrid Method for Efficiently Training Video Models , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[44] Abhinav Gupta,et al. ActionVLAD: Learning Spatio-Temporal Aggregation for Action Classification , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[45] Bolei Zhou,et al. Temporal Relational Reasoning in Videos , 2017, ECCV.
[46] Andrew Zisserman,et al. Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[47] Jitendra Malik,et al. SlowFast Networks for Video Recognition , 2018, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[48] Sung Wook Baik,et al. Action Recognition in Video Sequences using Deep Bi-Directional LSTM With CNN Features , 2018, IEEE Access.
[49] Efstratios Gavves,et al. VideoGraph: Recognizing Minutes-Long Human Activities in Videos , 2019, ArXiv.
[50] Tinne Tuytelaars,et al. Rank Pooling for Action Recognition , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[51] Larry S. Davis,et al. AdaFrame: Adaptive Frame Selection for Fast Video Recognition , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[52] Tor M. Aamodt,et al. ReSprop: Reuse Sparsified Backpropagation , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[53] Zhe L. Lin,et al. Best Frame Selection in a Short Video , 2020, 2020 IEEE Winter Conference on Applications of Computer Vision (WACV).