Olivier Teytaud | Sylvain Gelly | Olivier Bousquet | Karol Kurach | Philip Häusser | Damien Vincent | Michal Jastrzebski
[1] Fei-Fei Li, et al. Deep Visual-Semantic Alignments for Generating Image Descriptions, 2015, CVPR.
[2] Pietro Perona, et al. Microsoft COCO: Common Objects in Context, 2014, ECCV.
[3] Kevin Gimpel, et al. Towards Universal Paraphrastic Sentence Embeddings, 2015, ICLR.
[4] Yoshua Bengio, et al. Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling, 2014, arXiv.
[5] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[6] Mihai Surdeanu, et al. The Stanford CoreNLP Natural Language Processing Toolkit, 2014, ACL.
[7] David A. Forsyth, et al. Matching Words and Pictures, 2003, J. Mach. Learn. Res.
[8] Joo-Hwee Lim, et al. Content-Based Medical Image Retrieval with Metric Learning via Rank Correlation, 2010, MLMI.
[9] Yin Li, et al. Learning Deep Structure-Preserving Image-Text Embeddings, 2016, CVPR.
[10] Petr Sojka, et al. Software Framework for Topic Modelling with Large Corpora, 2010.
[11] Sanja Fidler, et al. Order-Embeddings of Images and Language, 2015, ICLR.
[12] Samy Bengio, et al. Show and Tell: A Neural Image Caption Generator, 2015, CVPR.
[13] Yoon Kim, et al. Convolutional Neural Networks for Sentence Classification, 2014, EMNLP.
[14] Jürgen Schmidhuber, et al. Long Short-Term Memory, 1997, Neural Computation.
[15] Geoffrey Zweig, et al. Linguistic Regularities in Continuous Space Word Representations, 2013, NAACL.
[16] Angeliki Lazaridou, et al. Combining Language and Vision with a Multimodal Skip-gram Model, 2015, NAACL.
[17] Trevor Darrell, et al. Learning Cross-Modality Similarity for Multinomial Data, 2011, ICCV.
[18] Vicente Ordonez, et al. Im2Text: Describing Images Using 1 Million Captioned Photographs, 2011, NIPS.
[19] Sergey Ioffe, et al. Rethinking the Inception Architecture for Computer Vision, 2016, CVPR.
[20] Alan L. Yuille, et al. Training and Evaluating Multimodal Word Embeddings with Large-scale Web Annotated Images, 2016, NIPS.
[21] Felix Hill, et al. Learning Abstract Concept Embeddings from Multi-Modal Data: Since You Probably Can't See What I Mean, 2014, EMNLP.
[22] Andrew Zisserman, et al. Very Deep Convolutional Networks for Large-Scale Image Recognition, 2014, ICLR.
[23] Jeffrey Pennington, et al. GloVe: Global Vectors for Word Representation, 2014, EMNLP.
[24] Dumitru Erhan, et al. Show and Tell: Lessons Learned from the 2015 MSCOCO Image Captioning Challenge, 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[25] Ruslan Salakhutdinov, et al. Unifying Visual-Semantic Embeddings with Multimodal Neural Language Models, 2014, arXiv.