Temporal Gaussian Mixture Layer for Videos
暂无分享,去创建一个
[1] Shih-Fu Chang,et al. ConvNet Architecture Search for Spatiotemporal Feature Learning , 2017, ArXiv.
[2] Ivan Laptev,et al. Learnable pooling with Context Gating for video classification , 2017, ArXiv.
[3] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[4] Cordelia Schmid,et al. AVA: A Video Dataset of Spatio-Temporally Localized Atomic Visual Actions , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[5] Fabio Viola,et al. The Kinetics Human Action Video Dataset , 2017, ArXiv.
[6] Horst Bischof,et al. A Duality Based Approach for Realtime TV-L1 Optical Flow , 2007, DAGM-Symposium.
[7] Michael S. Ryoo,et al. Learning Latent Super-Events to Detect Multiple Activities in Videos , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[8] Trevor Darrell,et al. Long-term recurrent convolutional networks for visual recognition and description , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[9] Matthew J. Hausknecht,et al. Beyond short snippets: Deep networks for video classification , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[10] Ali Farhadi,et al. Asynchronous Temporal Fields for Action Recognition , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[11] Shih-Fu Chang,et al. CDC: Convolutional-De-Convolutional Networks for Precise Temporal Action Localization in Untrimmed Videos , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[12] Bernard Ghanem,et al. ActivityNet: A large-scale video benchmark for human activity understanding , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[13] Sergey Ioffe,et al. Rethinking the Inception Architecture for Computer Vision , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[14] Georg Heigold,et al. A Gaussian Mixture Model layer jointly optimized with discriminative features within a Deep Neural Network architecture , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[15] Andrew Zisserman,et al. Convolutional Two-Stream Network Fusion for Video Action Recognition , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[16] J.K. Aggarwal,et al. Human activity analysis , 2011, ACM Comput. Surv..
[17] Andrew Zisserman,et al. Two-Stream Convolutional Networks for Action Recognition in Videos , 2014, NIPS.
[18] Cordelia Schmid,et al. Long-Term Temporal Convolutions for Action Recognition , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[19] Michael S. Ryoo,et al. Fine-Grained Activity Recognition in Baseball Videos , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
[20] Li Fei-Fei,et al. Every Moment Counts: Dense Detailed Labeling of Actions in Complex Videos , 2015, International Journal of Computer Vision.
[21] Bernard Ghanem,et al. DAPs: Deep Action Proposals for Action Understanding , 2016, ECCV.
[22] Kate Saenko,et al. R-C3D: Region Convolutional 3D Network for Temporal Activity Detection , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[23] Ali Farhadi,et al. Hollywood in Homes: Crowdsourcing Data Collection for Activity Understanding , 2016, ECCV.
[24] Shih-Fu Chang,et al. Temporal Action Localization in Untrimmed Videos via Multi-stage CNNs , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[25] Fei-Fei Li,et al. Large-Scale Video Classification with Convolutional Neural Networks , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[26] Deva Ramanan,et al. Predictive-Corrective Networks for Action Detection , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[27] Andrew Zisserman,et al. Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[28] Yifan Xu,et al. SpiderCNN: Deep Learning on Point Sets with Parameterized Convolutional Filters , 2018, ECCV.
[29] Lorenzo Torresani,et al. C3D: Generic Features for Video Analysis , 2014, ArXiv.
[30] Yann LeCun,et al. A Closer Look at Spatiotemporal Convolutions for Action Recognition , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[31] Michael S. Ryoo,et al. Title Learning Latent Subevents in Activity Videos Using Temporal Attention Filters , 2016, AAAI.
[32] Limin Wang,et al. Temporal Action Detection with Structured Segment Networks , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[33] Larry H. Matthies,et al. Pooled motion features for first-person videos , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[34] Yutaka Satoh,et al. Learning Spatio-Temporal Features with 3D Residual Networks for Action Recognition , 2017, 2017 IEEE International Conference on Computer Vision Workshops (ICCVW).
[35] Michael S. Ryoo,et al. Learning Latent Sub-events in Activity Videos Using Temporal Attention Filters , 2016, AAAI 2017.
[36] Rahul Sukthankar,et al. Rethinking the Faster R-CNN Architecture for Temporal Action Localization , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[37] Li Fei-Fei,et al. End-to-End Learning of Action Detection from Frame Glimpses in Videos , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).