TEINet: Towards an Efficient Architecture for Video Recognition
暂无分享,去创建一个
Feiyue Huang | Ying Tai | Tong Lu | Jilin Li | Chengjie Wang | Limin Wang | Yabiao Wang | Zhaoyang Liu | Donghao Luo | Limin Wang | Tong Lu | Ying Tai | Yabiao Wang | Donghao Luo | Chengjie Wang | Jilin Li | Feiyue Huang | Zhaoyang Liu
[1] Chong-Wah Ngo,et al. Learning Spatio-Temporal Representation With Local and Global Diffusion , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[2] Li Fei-Fei,et al. ImageNet: A large-scale hierarchical image database , 2009, CVPR.
[3] Song Han,et al. Temporal Shift Module for Efficient Video Understanding , 2018, ArXiv.
[4] Yi Yang,et al. DevNet: A Deep Event Network for multimedia event detection and evidence recounting , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[5] Limin Wang,et al. Temporal Segment Networks for Action Recognition in Videos , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[6] Bolei Zhou,et al. Temporal Relational Reasoning in Videos , 2017, ECCV.
[7] Andrew Zisserman,et al. Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[8] Fabio Viola,et al. The Kinetics Human Action Video Dataset , 2017, ArXiv.
[9] Susanne Westphal,et al. The “Something Something” Video Database for Learning and Evaluating Visual Common Sense , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[10] Thomas Serre,et al. HMDB: A large video database for human motion recognition , 2011, 2011 International Conference on Computer Vision.
[11] Andrew Zisserman,et al. Two-Stream Convolutional Networks for Action Recognition in Videos , 2014, NIPS.
[12] Abhinav Gupta,et al. Videos as Space-Time Region Graphs , 2018, ECCV.
[13] Xiao Liu,et al. StNet: Local and Global Spatial-Temporal Modeling for Action Recognition , 2018, AAAI.
[14] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[15] Limin Wang,et al. Temporal Action Detection with Structured Segment Networks , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[16] Jitendra Malik,et al. SlowFast Networks for Video Recognition , 2018, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[17] Limin Wang,et al. Appearance-and-Relation Networks for Video Classification , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[18] Enhua Wu,et al. Squeeze-and-Excitation Networks , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[19] Chen Sun,et al. Rethinking Spatiotemporal Feature Learning: Speed-Accuracy Trade-offs in Video Classification , 2017, ECCV.
[20] Yoshua Bengio,et al. Gradient-based learning applied to document recognition , 1998, Proc. IEEE.
[21] Yann LeCun,et al. A Closer Look at Spatiotemporal Convolutions for Action Recognition , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[22] Thomas Brox,et al. ECO: Efficient Convolutional Network for Online Video Understanding , 2018, ECCV.
[23] Chuang Gan,et al. TSM: Temporal Shift Module for Efficient Video Understanding , 2018, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[24] Andrew Zisserman,et al. Convolutional Two-Stream Network Fusion for Video Action Recognition , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[25] Lorenzo Torresani,et al. Learning Spatiotemporal Features with 3D Convolutional Networks , 2014, 2015 IEEE International Conference on Computer Vision (ICCV).
[26] Mubarak Shah,et al. UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild , 2012, ArXiv.
[27] Simon Haykin,et al. GradientBased Learning Applied to Document Recognition , 2001 .
[28] Bowen Zhang,et al. Real-Time Action Recognition with Enhanced Motion Vector CNNs , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[29] Deva Ramanan,et al. Attentional Pooling for Action Recognition , 2017, NIPS.
[30] Tao Mei,et al. Learning Spatio-Temporal Representation with Pseudo-3D Residual Networks , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[31] Richard P. Wildes,et al. Spatiotemporal Residual Networks for Video Action Recognition , 2016, NIPS.
[32] Xiang Li,et al. Spatial Group-wise Enhance: Improving Semantic Feature Learning in Convolutional Networks , 2019, ArXiv.
[33] Abhinav Gupta,et al. Non-local Neural Networks , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[34] Luc Van Gool,et al. Temporal Segment Networks: Towards Good Practices for Deep Action Recognition , 2016, ECCV.