STRONG: Spatio-Temporal Reinforcement Learning for Cross-Modal Video Moment Localization
Zheng Qin | Xiangnan He | Meng Liu | Da Cao | Meng Wang | Yawen Zeng