Visual versus Textual Embedding for Video Retrieval
[1] Lukáš Burget, et al. Extensions of recurrent neural network language model, 2011, 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[2] Kevin Gimpel, et al. Towards Universal Paraphrastic Sentence Embeddings, 2015, ICLR.
[3] Stéphane Ayache, et al. Video Corpus Annotation Using Active Learning, 2008, ECIR.
[4] Jeffrey Dean, et al. Distributed Representations of Words and Phrases and their Compositionality, 2013, NIPS.
[5] Samy Bengio, et al. Show and tell: A neural image caption generator, 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[6] James Allan, et al. Zero-shot video retrieval using content and concepts, 2013, CIKM.
[7] Quoc V. Le, et al. Sequence to Sequence Learning with Neural Networks, 2014, NIPS.
[8] Jonathan G. Fiscus, et al. TRECVID 2016: Evaluating Video Search, Video Event Detection, Localization, and Hyperlinking, 2016, TRECVID.
[9] Christoph Meinel, et al. Image Captioning with Deep Bidirectional LSTMs, 2016, ACM Multimedia.
[10] Mirella Lapata, et al. A Comparison of Vector-based Representations for Semantic Composition, 2012, EMNLP.
[11] Larry S. Davis, et al. VRFP: On-the-Fly Video Retrieval Using Web Images and Fast Fisher Vector Products, 2015, IEEE Transactions on Multimedia.
[12] Lukáš Burget, et al. Recurrent neural network based language model, 2010, INTERSPEECH.
[13] Bernard Mérialdo, et al. EURECOM at TrecVid 2013: The Semantic Indexing Task, 2013, TRECVID.
[14] Geoffrey E. Hinton, et al. ImageNet classification with deep convolutional neural networks, 2012, Commun. ACM.
[15] Andrew Zisserman, et al. Very Deep Convolutional Networks for Large-Scale Image Recognition, 2014, ICLR.
[16] Jeffrey Pennington, et al. GloVe: Global Vectors for Word Representation, 2014, EMNLP.
[17] Dennis Koelma, et al. The ImageNet Shuffle: Reorganized Pre-training for Video Event Detection, 2016, ICMR.
[18] Li Fei-Fei, et al. DenseCap: Fully Convolutional Localization Networks for Dense Captioning, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[19] Li Fei-Fei, et al. Deep visual-semantic alignments for generating image descriptions, 2015, CVPR.
[20] Benoit Huet, et al. When textual and visual information join forces for multimedia retrieval, 2014, ICMR.
[21] Bernard Mérialdo, et al. Natural Language Access to Video Databases, 2017, 2017 IEEE Third International Conference on Multimedia Big Data (BigMM).
[22] Cees Snoek, et al. Composite Concept Discovery for Zero-Shot Video Event Detection, 2014, ICMR.