MAC: Mining Activity Concepts for Language-Based Temporal Localization
Ramakant Nevatia | Jiyang Gao | Kan Chen | Runzhou Ge
[1] Bernt Schiele, et al. Grounding Action Descriptions in Videos, 2013, TACL.
[2] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[3] Jürgen Schmidhuber, et al. Long Short-Term Memory, 1997, Neural Computation.
[4] R. Nevatia, et al. TURN TAP: Temporal Unit Regression Network for Temporal Action Proposals, 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[5] Ramakant Nevatia, et al. ACD: Action Concept Discovery from Image-Sentence Corpora, 2016, ICMR.
[6] Kan Chen, et al. The 2013 SESAME Multimedia Event Detection and Recounting System, 2013, TRECVID.
[7] Ramakant Nevatia, et al. Automatic Concept Discovery from Parallel Text and Visual Corpora, 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[8] Limin Wang, et al. Temporal Action Detection with Structured Segment Networks, 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[9] Sanja Fidler, et al. Skip-Thought Vectors, 2015, NIPS.
[10] Ramakant Nevatia, et al. Knowledge Aided Consistency for Weakly Supervised Phrase Grounding, 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[11] Fei-Fei Li, et al. Deep visual-semantic alignments for generating image descriptions, 2015, CVPR.
[12] Bernt Schiele, et al. Script Data for Attribute-Based Recognition of Composite Activities, 2012, ECCV.
[13] Trevor Darrell, et al. Natural Language Object Retrieval, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[14] Ramakant Nevatia, et al. MSRC: multimodal spatial regression with semantic context for phrase grounding, 2017, International Journal of Multimedia Information Retrieval.
[15] Danqi Chen, et al. A Fast and Accurate Dependency Parser using Neural Networks, 2014, EMNLP.
[16] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[17] Shih-Fu Chang, et al. CDC: Convolutional-De-Convolutional Networks for Precise Temporal Action Localization in Untrimmed Videos, 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[18] Ramakant Nevatia, et al. ACTIVE: Activity Concept Transitions in Video Event Classification, 2013, 2013 IEEE International Conference on Computer Vision.
[19] Trevor Darrell, et al. Localizing Moments in Video with Natural Language, 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[20] Jeffrey Pennington, et al. GloVe: Global Vectors for Word Representation, 2014, EMNLP.
[21] Ramakant Nevatia, et al. TALL: Temporal Activity Localization via Language Query, 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[22] Limin Wang, et al. Temporal Action Detection with Structured Segment Networks, 2017, International Journal of Computer Vision.
[23] Rahul Sukthankar, et al. Rethinking the Faster R-CNN Architecture for Temporal Action Localization, 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[24] Fei-Fei Li, et al. Large-Scale Video Classification with Convolutional Neural Networks, 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[25] Ramakant Nevatia, et al. Query-Guided Regression Network with Context Policy for Phrase Grounding, 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[26] Yahong Han, et al. Multi-modal Circulant Fusion for Video-to-Language and Backward, 2018, IJCAI.
[27] Yann LeCun, et al. A Closer Look at Spatiotemporal Convolutions for Action Recognition, 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[28] Lorenzo Torresani, et al. Learning Spatiotemporal Features with 3D Convolutional Networks, 2014, 2015 IEEE International Conference on Computer Vision (ICCV).
[29] Ramakant Nevatia, et al. Cascaded Boundary Regression for Temporal Action Detection, 2017, BMVC.
[30] Songyang Zhang, et al. Temporal Action Localization, 2018.
[31] Ramakant Nevatia, et al. CTAP: Complementary Temporal Action Proposal Generation, 2018, ECCV.
[32] Kate Saenko, et al. R-C3D: Region Convolutional 3D Network for Temporal Activity Detection, 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[33] Ali Farhadi, et al. Hollywood in Homes: Crowdsourcing Data Collection for Activity Understanding, 2016, ECCV.
[34] Shih-Fu Chang, et al. Temporal Action Localization in Untrimmed Videos via Multi-stage CNNs, 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[35] Meng Liu, et al. Attentive Moment Retrieval in Videos, 2018, SIGIR.
[36] Fabio Viola, et al. The Kinetics Human Action Video Dataset, 2017, ArXiv.