Unified Spatio-Temporal Attention Networks for Action Recognition in Videos
暂无分享,去创建一个
Ling-Yu Duan | Tao Mei | Yong Rui | Dong Li | Ting Yao | Y. Rui | Tao Mei | Ling-yu Duan | Ting Yao | Dong Li
[1] Andrew Zisserman,et al. Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.
[2] Pong C. Yuen,et al. Reduced Analytic Dependency Modeling: Robust Fusion for Visual Recognition , 2014, International Journal of Computer Vision.
[3] Dumitru Erhan,et al. Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[4] Xiao-Ping Zhang,et al. A Hierarchical Spatio-Temporal Model for Human Activity Recognition , 2017, IEEE Transactions on Multimedia.
[5] Chih-Jen Lin,et al. LIBSVM: A library for support vector machines , 2011, TIST.
[6] Andrea Vedaldi,et al. Dynamic Image Networks for Action Recognition , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[7] Cordelia Schmid,et al. Action recognition by dense trajectories , 2011, CVPR 2011.
[8] Cordelia Schmid,et al. Learning realistic human actions from movies , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.
[9] Luc Van Gool,et al. An Efficient Dense and Scale-Invariant Spatio-Temporal Interest Point Detector , 2008, ECCV.
[10] 乔宇. Motionlets: Mid-Level 3D Parts for Human Motion Recognition , 2013 .
[11] Larry S. Davis,et al. Observing Human-Object Interactions: Using Spatial and Functional Compatibility for Recognition , 2009, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[12] Cordelia Schmid,et al. Action Recognition with Improved Trajectories , 2013, 2013 IEEE International Conference on Computer Vision.
[13] Cordelia Schmid,et al. The LEAR submission at Thumos 2014 , 2014 .
[14] Cordelia Schmid,et al. Long-Term Temporal Convolutions for Action Recognition , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[15] Bowen Zhang,et al. Real-Time Action Recognition with Enhanced Motion Vector CNNs , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[16] Tao Mei,et al. MSR Asia MSM at THUMOS Challenge 2015 , 2015 .
[17] Andrew Zisserman,et al. Convolutional Two-Stream Network Fusion for Video Action Recognition , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[18] Ramakant Nevatia,et al. Temporal Localization of Fine-Grained Actions in Videos by Domain Transfer from Web Images , 2015, ACM Multimedia.
[19] Michael S. Bernstein,et al. ImageNet Large Scale Visual Recognition Challenge , 2014, International Journal of Computer Vision.
[20] Fei-Fei Li,et al. Large-Scale Video Classification with Convolutional Neural Networks , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[21] Jun Wang,et al. Exploring Inter-feature and Inter-class Relationships with Deep Neural Networks for Video Classification , 2014, ACM Multimedia.
[22] Cordelia Schmid,et al. A Spatio-Temporal Descriptor Based on 3D-Gradients , 2008, BMVC.
[23] Xi Wang,et al. Evaluating Two-Stream CNN for Video Classification , 2015, ICMR.
[24] Tao Mei,et al. Deep Quantization: Encoding Convolutional Activations with Deep Generative Model , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[25] Ruslan Salakhutdinov,et al. Action Recognition using Visual Attention , 2015, NIPS 2015.
[26] Richard P. Wildes,et al. Spatiotemporal Multiplier Networks for Video Action Recognition , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[27] Yu Qiao,et al. Recurrent Spatial-Temporal Attention Network for Action Recognition in Videos , 2018, IEEE Transactions on Image Processing.
[28] Shih-Fu Chang,et al. Temporal Action Localization in Untrimmed Videos via Multi-stage CNNs , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[29] Patrick Bouthemy,et al. Better Exploiting Motion for Better Action Recognition , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.
[30] Mubarak Shah,et al. A 3-dimensional sift descriptor and its application to action recognition , 2007, ACM Multimedia.
[31] Xiaoou Tang,et al. Action Recognition and Detection by Combining Motion and Appearance Features , 2014 .
[32] Baoxin Li,et al. Hierarchical Attention Network for Action Recognition in Videos , 2016, ArXiv.
[33] Ali Farhadi,et al. Asynchronous Temporal Fields for Action Recognition , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[34] Jeremy S. Smith,et al. Hierarchical Multi-scale Attention Networks for action recognition , 2017, Signal Process. Image Commun..
[35] Tao Mei,et al. Learning hierarchical video representation for action recognition , 2017, International Journal of Multimedia Information Retrieval.
[36] Horst Bischof,et al. A Duality Based Approach for Realtime TV-L1 Optical Flow , 2007, DAGM-Symposium.
[37] Tao Mei,et al. Learning Deep Intrinsic Video Representation by Exploring Temporal Coherence and Graph Structure , 2016, IJCAI.
[38] Serge J. Belongie,et al. Behavior recognition via sparse spatio-temporal features , 2005, 2005 IEEE International Workshop on Visual Surveillance and Performance Evaluation of Tracking and Surveillance.
[39] Thomas Mensink,et al. Improving the Fisher Kernel for Large-Scale Image Classification , 2010, ECCV.
[40] Silvio Savarese,et al. Action Recognition by Hierarchical Mid-Level Action Elements , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[41] Lin Sun,et al. Human Action Recognition Using Factorized Spatio-Temporal Convolutional Networks , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[42] Shih-Fu Chang,et al. Consumer video understanding: a benchmark database and an evaluation of human and machine performance , 2011, ICMR.
[43] Trevor Darrell,et al. Caffe: Convolutional Architecture for Fast Feature Embedding , 2014, ACM Multimedia.
[44] Wenjun Zeng,et al. An End-to-End Spatio-Temporal Attention Model for Human Action Recognition from Skeleton Data , 2016, AAAI.
[45] Yi Yang,et al. A discriminative CNN video representation for event detection , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[46] Tao Mei,et al. Action Recognition by Learning Deep Multi-Granular Spatio-Temporal Video Representation , 2016, ICMR.
[47] G. K. Tam,et al. Event Fisher Vectors: Robust Encoding Visual Diversity of Visual Streams , 2015 .
[48] Cees Snoek,et al. What do 15,000 object categories tell us about classifying and localizing actions? , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[49] Nitish Srivastava,et al. Unsupervised Learning of Video Representations using LSTMs , 2015, ICML.
[50] Zhong Zhou,et al. Learning Spatial and Temporal Extents of Human Actions for Action Detection , 2015, IEEE Transactions on Multimedia.
[51] Richard P. Wildes,et al. Spatiotemporal Residual Networks for Video Action Recognition , 2016, NIPS.
[52] Cees G. M. Snoek,et al. University of Amsterdam at THUMOS Challenge 2014 , 2014 .
[53] Nitish Srivastava,et al. Exploiting Image-trained CNN Architectures for Unconstrained Video Classification , 2015, BMVC.
[54] Amit K. Roy-Chowdhury,et al. A Continuous Learning Framework for Activity Recognition Using Deep Hybrid Feature Models , 2015, IEEE Transactions on Multimedia.
[55] Matthew J. Hausknecht,et al. Beyond short snippets: Deep networks for video classification , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[56] Tao Mei,et al. Recurrent Tubelet Proposal and Recognition Networks for Action Detection , 2018, ECCV.
[57] Andrew Zisserman,et al. Two-Stream Convolutional Networks for Action Recognition in Videos , 2014, NIPS.
[58] Gang Wang,et al. Global Context-Aware Attention LSTM Networks for 3D Action Recognition , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[59] Gaurav Sharma,et al. AdaScan: Adaptive Scan Pooling in Deep Convolutional Neural Networks for Human Action Recognition in Videos , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[60] Tao Mei,et al. Super Fast Event Recognition in Internet Videos , 2015, IEEE Transactions on Multimedia.
[61] Lorenzo Torresani,et al. Learning Spatiotemporal Features with 3D Convolutional Networks , 2014, 2015 IEEE International Conference on Computer Vision (ICCV).
[62] Mubarak Shah,et al. UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild , 2012, ArXiv.
[63] Xi Wang,et al. Exploiting Objects with LSTMs for Video Categorization , 2016, ACM Multimedia.
[64] Tao Mei,et al. Learning Spatio-Temporal Representation with Pseudo-3D Residual Networks , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[65] Trevor Darrell,et al. Long-term recurrent convolutional networks for visual recognition and description , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[66] Tao Mei,et al. MSR Asia MSM at ActivityNet Challenge 2017: Trimmed Action Recognition, Temporal Action Proposals and Dense-Captioning Events in Videos , 2017 .
[67] Cees Snoek,et al. VideoLSTM convolves, attends and flows for action recognition , 2016, Comput. Vis. Image Underst..
[68] Ivan Laptev,et al. On Space-Time Interest Points , 2005, International Journal of Computer Vision.
[69] Ali Farhadi,et al. Actions ~ Transformations , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[70] Behrooz Mahasseni,et al. Regularizing Long Short Term Memory with 3D Human-Skeleton Sequences for Action Recognition , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).