Generation and Comprehension of Unambiguous Object Descriptions
Alan L. Yuille | Kevin Murphy | Junhua Mao | Jonathan Huang | Alexander Toshev | Oana-Maria Camburu
[1] Terry Winograd,et al. Understanding natural language , 1974 .
[2] Lalit R. Bahl,et al. Maximum mutual information estimation of hidden Markov model parameters for speech recognition , 1986, ICASSP '86. IEEE International Conference on Acoustics, Speech, and Signal Processing.
[3] Emiel Krahmer,et al. Efficient context-sensitive generation of referring expressions , 2002 .
[4] Kees van Deemter,et al. Information sharing : reference and presupposition in language generation and interpretation , 2002 .
[5] Salim Roukos,et al. Bleu: a Method for Automatic Evaluation of Machine Translation , 2002, ACL.
[6] Siobhan Chapman. Logic and Conversation , 2005 .
[7] Ielka van der Sluis,et al. Building a Semantically Transparent Corpus for the Generation of Referring Expressions. , 2006, INLG.
[8] Alon Lavie,et al. METEOR: An Automatic Metric for MT Evaluation with High Levels of Correlation with Human Judgments , 2007, WMT@ACL.
[9] Robert Dale,et al. The Use of Spatial Relations in Referring Expression Generation , 2008, INLG.
[10] Li Fei-Fei,et al. ImageNet: A large-scale hierarchical image database , 2009, CVPR.
[11] Kees van Deemter,et al. Natural Reference to Objects in a Visual Domain , 2010, INLG.
[12] Hugo Jair Escalante,et al. The segmented and annotated IAPR TC-12 benchmark , 2010, Comput. Vis. Image Underst.
[13] Dan Klein,et al. A Game-Theoretic Approach to Generating Spatial Descriptions , 2010, EMNLP.
[14] Cyrus Rashtchian,et al. Every Picture Tells a Story: Generating Sentences from Images , 2010, ECCV.
[15] Yejin Choi,et al. Baby talk: Understanding and generating simple image descriptions , 2011, CVPR 2011.
[16] Yiannis Aloimonos,et al. Corpus-Guided Sentence Generation of Natural Images , 2011, EMNLP.
[17] Vicente Ordonez,et al. Im2Text: Describing Images Using 1 Million Captioned Photographs , 2011, NIPS.
[18] Tamara L. Berg,et al. Baby Talk: Understanding and Generating Image Descriptions , 2011 .
[19] Ali Farhadi,et al. Recognition using visual phrases , 2011, CVPR 2011.
[20] Yejin Choi,et al. Composing Simple Image Descriptions using Web-scale N-grams , 2011, CoNLL.
[21] Tsuhan Chen,et al. Image description with a goal: Building efficient discriminating expressions for images , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.
[22] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.
[23] Emiel Krahmer,et al. Computational Generation of Referring Expressions: A Survey , 2012, CL.
[24] Peter Young,et al. Framing Image Description as a Ranking Task: Data, Models and Evaluation Metrics , 2013, J. Artif. Intell. Res.
[25] Kees van Deemter,et al. Generating Expressions that Refer to Visible Objects , 2013, NAACL.
[26] Luke S. Zettlemoyer,et al. Learning Distributions over Logical Forms for Referring Expression Generation , 2013, EMNLP.
[27] Quoc V. Le,et al. Grounded Compositional Semantics for Finding and Describing Images with Sentences , 2014, TACL.
[28] Ruslan Salakhutdinov,et al. Unifying Visual-Semantic Embeddings with Multimodal Neural Language Models , 2014, ArXiv.
[29] Trevor Darrell,et al. Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation , 2013, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[30] Dumitru Erhan,et al. Scalable Object Detection Using Deep Neural Networks , 2013, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[31] Pietro Perona,et al. Microsoft COCO: Common Objects in Context , 2014, ECCV.
[32] Vicente Ordonez,et al. ReferItGame: Referring to Objects in Photographs of Natural Scenes , 2014, EMNLP.
[33] Mario Fritz,et al. A Multi-World Approach to Question Answering about Real-World Scenes based on Uncertain Input , 2014, NIPS.
[34] Vibhav Vineet,et al. ImageSpirit: Verbal Guided Image Parsing , 2013, ACM Trans. Graph..
[35] Ruslan Salakhutdinov,et al. Multimodal Neural Language Models , 2014, ICML.
[36] Donald Geman,et al. Visual Turing test for computer vision systems , 2015, Proceedings of the National Academy of Sciences.
[37] Svetlana Lazebnik,et al. Flickr30k Entities: Collecting Region-to-Phrase Correspondences for Richer Image-to-Sentence Models , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[38] Svetlana Lazebnik,et al. Flickr30k Entities: Collecting Region-to-Phrase Correspondences for Richer Image-to-Sentence Models , 2015, International Journal of Computer Vision.
[39] Geoffrey Zweig,et al. From captions to visual concepts and back , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[40] C. Lawrence Zitnick,et al. CIDEr: Consensus-based image description evaluation , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[41] Wei Xu,et al. Are You Talking to a Machine? Dataset and Methods for Multilingual Image Question Answering , 2015, NIPS.
[42] Saurabh Gupta,et al. Exploring Nearest Neighbor Approaches for Image Captioning , 2015, ArXiv.
[43] Yoshua Bengio,et al. Show, Attend and Tell: Neural Image Caption Generation with Visual Attention , 2015, ICML.
[44] Jeffrey Mark Siskind,et al. Robot Language Learning, Generation, and Comprehension , 2015, ArXiv.
[45] Wei Xu,et al. Deep Captioning with Multimodal Recurrent Neural Networks (m-RNN) , 2014, ICLR.
[46] Lisa Anne Hendricks,et al. Long-term recurrent convolutional networks for visual recognition and description , 2015, CVPR.
[47] Margaret Mitchell,et al. VQA: Visual Question Answering , 2015, International Journal of Computer Vision.
[48] Xinlei Chen,et al. Mind's eye: A recurrent visual representation for image caption generation , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[49] Noah D. Goodman,et al. Probabilistic Semantics and Pragmatics: Uncertainty in Language and Thought , 2015 .
[50] Mario Fritz,et al. Ask Your Neurons: A Neural-Based Approach to Answering Questions about Images , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[51] Samy Bengio,et al. Show and tell: A neural image caption generator , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[52] Dimitra Gkatzia,et al. From the Virtual to the Real World: Referring to Objects in Real-World Spatial Scenes , 2015, EMNLP.
[53] Andrew Zisserman,et al. Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.
[54] Trevor Darrell,et al. Long-term recurrent convolutional networks for visual recognition and description , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[55] Michael S. Bernstein,et al. Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations , 2016, International Journal of Computer Vision.
[56] Trevor Darrell,et al. Natural Language Object Retrieval , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[57] Li Fei-Fei,et al. DenseCap: Fully Convolutional Localization Networks for Dense Captioning , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[58] Fei-Fei Li,et al. Deep visual-semantic alignments for generating image descriptions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[59] Jürgen Schmidhuber,et al. LSTM: A Search Space Odyssey , 2015, IEEE Transactions on Neural Networks and Learning Systems.