[1] Jürgen Schmidhuber, et al. Long Short-Term Memory, 1997, Neural Computation.
[2] Shuicheng Yan, et al. A2-Nets: Double Attention Networks, 2018, NeurIPS.
[3] Luc Van Gool, et al. stagNet: An Attentive Semantic RNN for Group Activity and Individual Action Recognition, 2020, IEEE Transactions on Circuits and Systems for Video Technology.
[4] Jiebo Luo, et al. Action Recognition With Spatio–Temporal Visual Attention on Skeleton Image Sequences, 2018, IEEE Transactions on Circuits and Systems for Video Technology.
[5] Richard P. Wildes, et al. Temporal Residual Networks for Dynamic Scene Recognition, 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[6] Qiang Ji, et al. Action recognition and localization with spatial and temporal contexts, 2019, Neurocomputing.
[7] Bolei Zhou, et al. Learning Deep Features for Discriminative Localization, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[8] Kunfeng Wang, et al. A recurrent attention and interaction model for pedestrian trajectory prediction, 2020, IEEE/CAA Journal of Automatica Sinica.
[9] Geoffrey E. Hinton, et al. ImageNet classification with deep convolutional neural networks, 2012, Commun. ACM.
[10] Shih-Fu Chang, et al. ConvNet Architecture Search for Spatiotemporal Feature Learning, 2017, ArXiv.
[11] Lorenzo Torresani, et al. Learning Spatiotemporal Features with 3D Convolutional Networks, 2014, 2015 IEEE International Conference on Computer Vision (ICCV).
[12] Chao Li, et al. Collaborative Spatiotemporal Feature Learning for Video Action Recognition, 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[13] Jitendra Malik, et al. SlowFast Networks for Video Recognition, 2018, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[14] Davide Modolo, et al. Action Recognition With Spatial-Temporal Discriminative Filter Banks, 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[15] Hichem Snoussi, et al. Exploring a rich spatial-temporal dependent relational model for skeleton-based action recognition by bidirectional LSTM-CNN, 2020, Neurocomputing.
[16] Dit-Yan Yeung, et al. Convolutional LSTM Network: A Machine Learning Approach for Precipitation Nowcasting, 2015, NIPS.
[17] Andrew Zisserman, et al. Two-Stream Convolutional Networks for Action Recognition in Videos, 2014, NIPS.
[18] Dezhong Peng, et al. Global-Attention-Based Neural Networks for Vision Language Intelligence, 2021, IEEE/CAA Journal of Automatica Sinica.
[19] Yunchao Wei, et al. CCNet: Criss-Cross Attention for Semantic Segmentation, 2018, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[20] Trevor Darrell, et al. Long-term recurrent convolutional networks for visual recognition and description, 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[21] Tao Mei, et al. Learning Spatio-Temporal Representation with Pseudo-3D Residual Networks, 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[22] Limin Wang, et al. Temporal Segment Networks for Action Recognition in Videos, 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[23] Yann LeCun, et al. A Closer Look at Spatiotemporal Convolutions for Action Recognition, 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[24] Jin Guo, et al. A spatial-temporal attention model for human trajectory prediction, 2020, IEEE/CAA Journal of Automatica Sinica.
[25] Qiuqi Ruan, et al. Spatial-temporal pyramid based Convolutional Neural Network for action recognition, 2019, Neurocomputing.
[26] Christopher Joseph Pal, et al. Delving Deeper into Convolutional Networks for Learning Video Representations, 2015, ICLR.
[27] Razvan Pascanu, et al. A simple neural network module for relational reasoning, 2017, NIPS.
[28] Thomas H. Li, et al. Spatial–Temporal Context-Aware Online Action Detection and Prediction, 2020, IEEE Transactions on Circuits and Systems for Video Technology.
[29] Stephen Lin, et al. GCNet: Non-Local Networks Meet Squeeze-Excitation Networks and Beyond, 2019, 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW).
[30] Lukasz Kaiser, et al. Attention is All you Need, 2017, NIPS.
[31] Yiming Hu, et al. Convolutional relation network for skeleton-based action recognition, 2019, Neurocomputing.
[32] Enhua Wu, et al. Squeeze-and-Excitation Networks, 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[33] Natalia Gimelshein, et al. PyTorch: An Imperative Style, High-Performance Deep Learning Library, 2019, NeurIPS.
[34] Yuxin Peng, et al. Two-Stream Collaborative Learning With Spatial-Temporal Attention for Video Classification, 2017, IEEE Transactions on Circuits and Systems for Video Technology.
[35] Luc Van Gool, et al. Spatio-Temporal Channel Correlation Networks for Action Classification, 2018, ECCV.
[36] Shuicheng Yan, et al. Multi-Fiber Networks for Video Recognition, 2018, ECCV.
[37] Andrew Zisserman, et al. Very Deep Convolutional Networks for Large-Scale Image Recognition, 2014, ICLR.
[38] Bolei Zhou, et al. Temporal Relational Reasoning in Videos, 2017, ECCV.
[39] Andrew Zisserman, et al. Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset, 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[40] Fabio Viola, et al. The Kinetics Human Action Video Dataset, 2017, ArXiv.
[41] Chen Zhu, et al. Vision Based Hand Gesture Recognition Using 3D Shape Context, 2018, 2018 IEEE International Conference on Robotics and Biomimetics (ROBIO).
[42] Reza Safabakhsh, et al. Correlational Convolutional LSTM for human action recognition, 2020, Neurocomputing.
[43] Errui Ding, et al. Compact Generalized Non-local Network, 2018, NeurIPS.
[44] Yi Yang, et al. FASTER Recurrent Networks for Efficient Video Classification, 2019, AAAI.
[45] Mubarak Shah, et al. UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild, 2012, ArXiv.
[46] Lingfeng Wang, et al. Weakly Semantic Guided Action Recognition, 2019, IEEE Transactions on Multimedia.
[47] Thomas Serre, et al. HMDB: A large video database for human motion recognition, 2011, 2011 International Conference on Computer Vision.
[48] Abhinav Gupta, et al. Non-local Neural Networks, 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[49] Fei-Fei Li, et al. Large-Scale Video Classification with Convolutional Neural Networks, 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[50] Xiaoyan Sun, et al. Temporal–Spatial Mapping for Action Recognition, 2018, IEEE Transactions on Circuits and Systems for Video Technology.
[51] Andrew Zisserman, et al. Video Action Transformer Network, 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[52] Тараса Шевченка, et al. Quo vadis?, 2013, Clinical chemistry.
[53] Hong Liu, et al. Expectation-Maximization Attention Networks for Semantic Segmentation, 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[54] Chuang Gan, et al. TSM: Temporal Shift Module for Efficient Video Understanding, 2018, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[55] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[56] Chen Sun, et al. Rethinking Spatiotemporal Feature Learning: Speed-Accuracy Trade-offs in Video Classification, 2017, ECCV.