Spatio-temporal deformable 3D ConvNets with attention for action recognition
暂无分享,去创建一个
Xianglong Liu | Jun Li | Deqing Wang | Mingyuan Zhang | Jun Yu Li | Xianglong Liu | Mingyuan Zhang | Deqing Wang
[1] Cordelia Schmid,et al. Action recognition by dense trajectories , 2011, CVPR 2011.
[2] Cordelia Schmid,et al. Action Recognition with Improved Trajectories , 2013, 2013 IEEE International Conference on Computer Vision.
[3] Yang Du,et al. Interaction-Aware Spatio-Temporal Pyramid Attention Networks for Action Classification , 2018, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[4] Lingfeng Wang,et al. Joint spatial-temporal attention for action recognition , 2018, Pattern Recognit. Lett..
[5] Ali Borji,et al. Revisiting Video Saliency: A Large-Scale Benchmark and a New Model , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[6] Matthew J. Hausknecht,et al. Beyond short snippets: Deep networks for video classification , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[7] Yutaka Satoh,et al. Can Spatiotemporal 3D CNNs Retrace the History of 2D CNNs and ImageNet? , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[8] Abhinav Gupta,et al. Non-local Neural Networks , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[9] Luc Van Gool,et al. Temporal Segment Networks: Towards Good Practices for Deep Action Recognition , 2016, ECCV.
[10] Tao Mei,et al. Learning Spatio-Temporal Representation with Pseudo-3D Residual Networks , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[11] Sridha Sridharan,et al. Two Stream LSTM: A Deep Fusion Framework for Human Action Recognition , 2017, 2017 IEEE Winter Conference on Applications of Computer Vision (WACV).
[12] Baoxin Li,et al. Multi-stream CNN: Learning representations based on human-related regions for action recognition , 2018, Pattern Recognit..
[13] Shijian Lu,et al. TORNADO: A Spatio-Temporal Convolutional Regression Network for Video Action Proposal , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[14] Stan Sclaroff,et al. Do less and achieve more: Training CNNs for action recognition utilizing action images from the Web , 2015, Pattern Recognit..
[15] Mubarak Shah,et al. A 3-dimensional sift descriptor and its application to action recognition , 2007, ACM Multimedia.
[16] Junmo Kim,et al. Active Convolution: Learning the Shape of Convolution for Image Classification , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[17] Li Xie,et al. Fully convolutional networks for action recognition , 2017, IET Comput. Vis..
[18] Ling-Yu Duan,et al. Unified Spatio-Temporal Attention Networks for Action Recognition in Videos , 2019, IEEE Transactions on Multimedia.
[19] Ruigang Yang,et al. Saliency-Aware Video Object Segmentation , 2018, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[20] Lukasz Kaiser,et al. Attention is All you Need , 2017, NIPS.
[21] Andrew Zisserman,et al. Two-Stream Convolutional Networks for Action Recognition in Videos , 2014, NIPS.
[22] Gang Wang,et al. Global Context-Aware Attention LSTM Networks for 3D Action Recognition , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[23] Chuang Gan,et al. Beyond RNNs: Positional Self-Attention with Co-Attention for Video Question Answering , 2019, AAAI.
[24] Kai Liu,et al. Tensor-based linear dynamical systems for action recognition from 3D skeletons , 2018, Pattern Recognit..
[25] Yi Li,et al. Deformable Convolutional Networks , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[26] Andrew Zisserman,et al. Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[27] Ling Shao,et al. Compressive Sequential Learning for Action Similarity Labeling , 2016, IEEE Transactions on Image Processing.
[28] Xiaoshuai Sun,et al. Two-Stream 3-D convNet Fusion for Action Recognition in Videos With Arbitrary Size and Length , 2018, IEEE Transactions on Multimedia.
[29] Andrew Zisserman,et al. Convolutional Two-Stream Network Fusion for Video Action Recognition , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[30] Yunchao Wei,et al. CCNet: Criss-Cross Attention for Semantic Segmentation , 2018, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[31] Richard P. Wildes,et al. Spatiotemporal Multiplier Networks for Video Action Recognition , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[32] Ruigang Yang,et al. Semi-Supervised Video Object Segmentation with Super-Trajectories , 2019, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[33] Ivan Laptev,et al. On Space-Time Interest Points , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.
[34] Stephen Lin,et al. Deformable ConvNets V2: More Deformable, Better Results , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[35] Cordelia Schmid,et al. Human Detection Using Oriented Histograms of Flow and Appearance , 2006, ECCV.
[36] Luc Van Gool,et al. Spatio-Temporal Channel Correlation Networks for Action Classification , 2018, ECCV.
[37] Jiwen Lu,et al. Learning principal orientations and residual descriptor for action recognition , 2019, Pattern Recognit..
[38] Xiaoyan Sun,et al. MiCT: Mixed 3D/2D Convolutional Tube for Human Action Recognition , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[39] Sergio Escalera,et al. Residual Stacked RNNs for Action Recognition , 2018, ECCV Workshops.
[40] Tiejun Huang,et al. Temporal Attentive Network for Action Recognition , 2018, 2018 IEEE International Conference on Multimedia and Expo (ICME).
[41] Jean-Michel Morel,et al. A non-local algorithm for image denoising , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).
[42] Lorenzo Torresani,et al. Learning Spatiotemporal Features with 3D Convolutional Networks , 2014, 2015 IEEE International Conference on Computer Vision (ICCV).
[43] Mubarak Shah,et al. UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild , 2012, ArXiv.
[44] Kaiming He,et al. Long-Term Feature Banks for Detailed Video Understanding , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[45] Ling Shao,et al. Video Salient Object Detection via Fully Convolutional Networks , 2017, IEEE Transactions on Image Processing.
[46] Andrew Zisserman,et al. Spatial Transformer Networks , 2015, NIPS.
[47] Meng Wang,et al. Self-Supervised Video Hashing With Hierarchical Binary Auto-Encoder , 2018, IEEE Transactions on Image Processing.
[48] Dit-Yan Yeung,et al. Convolutional LSTM Network: A Machine Learning Approach for Precipitation Nowcasting , 2015, NIPS.
[49] Kaiming He,et al. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[50] Ming Yang,et al. 3D Convolutional Neural Networks for Human Action Recognition , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[51] Xuelong Li,et al. From Deterministic to Generative: Multimodal Stochastic RNNs for Video Captioning , 2017, IEEE Transactions on Neural Networks and Learning Systems.
[52] Limin Wang,et al. Action recognition with trajectory-pooled deep-convolutional descriptors , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[53] Bingbing Ni,et al. Binary Coding for Partial Action Analysis with Limited Observation Ratios , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[54] Hao Yang,et al. Time-Asymmetric 3d Convolutional Neural Networks for Action Recognition , 2019, 2019 IEEE International Conference on Image Processing (ICIP).
[55] Vladlen Koltun,et al. Multi-Scale Context Aggregation by Dilated Convolutions , 2015, ICLR.
[56] Yadong Mu,et al. Boosting Temporal Binary Coding for Large-Scale Video Search , 2017, IEEE Transactions on Multimedia.
[57] Chao Li,et al. Active Transfer Learning Network: A Unified Deep Joint Spectral–Spatial Feature Learning Model for Hyperspectral Image Classification , 2019, IEEE Transactions on Geoscience and Remote Sensing.
[58] Thomas Serre,et al. HMDB: A large video database for human motion recognition , 2011, 2011 International Conference on Computer Vision.
[59] Bingbing Ni,et al. Zero-Shot Action Recognition with Error-Correcting Output Codes , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[60] Dahua Lin,et al. Trajectory Convolution for Action Recognition , 2018, NeurIPS.