Cmu-ucr-bosch @ Trecvid 2017: Video to Text Retrieval
暂无分享,去创建一个
[1] Leonidas J. Guibas,et al. Joint embeddings of shapes and images via CNN image purification , 2015, ACM Trans. Graph..
[2] Ruslan Salakhutdinov,et al. Unifying Visual-Semantic Embeddings with Multimodal Neural Language Models , 2014, ArXiv.
[3] Christopher D. Manning,et al. Bilingual Word Embeddings for Phrase-Based Machine Translation , 2013, EMNLP.
[4] Yoshua Bengio,et al. Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling , 2014, ArXiv.
[5] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[6] Xinlei Chen,et al. Microsoft COCO Captions: Data Collection and Evaluation Server , 2015, ArXiv.
[7] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.
[8] David J. Fleet,et al. VSE++: Improved Visual-Semantic Embeddings , 2017, ArXiv.
[9] Fei-Fei Li,et al. Deep visual-semantic alignments for generating image descriptions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[10] Amit K. Roy-Chowdhury,et al. Generating Diverse Image Datasets with Limited Labeling , 2016, ACM Multimedia.
[11] Georges Quénot,et al. TRECVID 2017: Evaluating Ad-hoc and Instance Video Search, Events Detection, Video Captioning and Hyperlinking , 2017, TRECVID.
[12] S. Shankar Sastry,et al. Dissimilarity-Based Sparse Subset Selection , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.