EURECOM participation in TrecVid VTT 2018
暂无分享,去创建一个
[1] Li Fei-Fei,et al. ImageNet: A large-scale hierarchical image database , 2009, CVPR.
[2] Yale Song,et al. TGIF: A New Dataset and Benchmark on Animated GIF Description , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[3] Georges Quénot,et al. TRECVID 2017: Evaluating Ad-hoc and Instance Video Search, Events Detection, Video Captioning and Hyperlinking , 2017, TRECVID.
[4] Yoshua Bengio,et al. Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation , 2014, EMNLP.
[5] Jianfeng Dong,et al. DL-61-86 at TRECVID 2017: Video-to-Text Description , 2017, TRECVID.
[6] Tao Mei,et al. MSR-VTT: A Large Video Description Dataset for Bridging Video and Language , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[7] Pietro Perona,et al. Microsoft COCO: Common Objects in Context , 2014, ECCV.
[8] William B. Dolan,et al. Collecting Highly Parallel Data for Paraphrase Evaluation , 2011, ACL.
[9] Jonathan G. Fiscus,et al. TRECVID 2018: Benchmarking Video Activity Detection, Video Captioning and Matching, Video Storytelling Linking and Video Search , 2018, TRECVID.
[10] Andrew Zisserman,et al. Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[11] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[12] Jeffrey Dean,et al. Distributed Representations of Words and Phrases and their Compositionality , 2013, NIPS.
[13] David J. Fleet,et al. VSE++: Improved Visual-Semantic Embeddings , 2017, ArXiv.
[14] Bernard Mérialdo,et al. Gated Recurrent Capsules for Visual Word Embeddings , 2019, MMM.