Multi-scale 2D Representation Learning for weakly-supervised moment retrieval
暂无分享,去创建一个
Wensheng Zhang | Zhizhong Zhang | Yongqiang Tang | Ding Li | Rui Wu | Wensheng Zhang | Rui Wu | Ding Li | Zhizhong Zhang | Yongqiang Tang
[1] Liang Wang,et al. Language-Driven Temporal Activity Localization: A Semantic Matching Reinforcement Learning Model , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[2] Lei Zhang,et al. AutoLoc: Weakly-supervised Temporal Action Localization , 2018, ECCV.
[3] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[4] Yu-Gang Jiang,et al. Semantic Proposal for Activity Localization in Videos via Sentence Query , 2019, AAAI.
[5] Larry S. Davis,et al. WSLLN:Weakly Supervised Natural Language Localization Networks , 2019, EMNLP.
[6] Ming Yang,et al. BSN: Boundary Sensitive Network for Temporal Action Proposal Generation , 2018, ECCV.
[7] Zhijie Lin,et al. Weakly-Supervised Video Moment Retrieval via Semantic Completion Network , 2020, AAAI.
[8] Lorenzo Torresani,et al. Learning Spatiotemporal Features with 3D Convolutional Networks , 2014, 2015 IEEE International Conference on Computer Vision (ICCV).
[9] Ali Farhadi,et al. Hollywood in Homes: Crowdsourcing Data Collection for Activity Understanding , 2016, ECCV.
[10] Luc Van Gool,et al. Efficient Non-Maximum Suppression , 2006, 18th International Conference on Pattern Recognition (ICPR'06).
[11] Kate Saenko,et al. LoGAN: Latent Graph Co-Attention Network for Weakly-Supervised Video Moment Retrieval , 2019, 2021 IEEE Winter Conference on Applications of Computer Vision (WACV).
[12] Jiebo Luo,et al. Learning 2D Temporal Adjacent Networks for Moment Localization with Natural Language , 2019, AAAI.
[13] Youngjung Uh,et al. Background Suppression Network for Weakly-supervised Temporal Action Localization , 2020, ArXiv.
[14] Ramakant Nevatia,et al. TALL: Temporal Activity Localization via Language Query , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[15] Fei-Fei Li,et al. Large-Scale Video Classification with Convolutional Neural Networks , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[16] Rongrong Ji,et al. Fast Learning of Temporal Action Proposal via Dense Boundary Generator , 2019, AAAI.
[17] Runhao Zeng,et al. Graph Convolutional Networks for Temporal Action Localization , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[18] Chuang Gan,et al. Weakly Supervised Dense Event Captioning in Videos , 2018, NeurIPS.
[19] Kate Saenko,et al. Multilevel Language and Vision Integration for Text-to-Clip Retrieval , 2018, AAAI.
[20] Ah Chung Tsoi,et al. The Graph Neural Network Model , 2009, IEEE Transactions on Neural Networks.
[21] Hongdong Li,et al. Proposal-free Temporal Moment Localization of a Natural-Language Query in Video using Guided Attention , 2019, 2020 IEEE Winter Conference on Applications of Computer Vision (WACV).
[22] Jeffrey Pennington,et al. GloVe: Global Vectors for Word Representation , 2014, EMNLP.
[23] Trevor Darrell,et al. Localizing Moments in Video with Natural Language , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[24] Trevor Darrell,et al. Caffe: Convolutional Architecture for Fast Feature Embedding , 2014, ACM Multimedia.
[25] Amit K. Roy-Chowdhury,et al. Weakly Supervised Video Moment Retrieval From Text Queries , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[26] Jürgen Schmidhuber,et al. Learning to Forget: Continual Prediction with LSTM , 2000, Neural Computation.
[27] Jingwen Wang,et al. Temporally Grounding Language Queries in Videos by Contextual Boundary-aware Prediction , 2020, AAAI.