Unsupervised Visual Sense Disambiguation for Verbs using Multimodal Embeddings
[1] Jiaxuan Wang,et al. HICO: A Benchmark for Recognizing Human-Object Interactions in Images , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[2] Margaret Mitchell,et al. VQA: Visual Question Answering , 2015, International Journal of Computer Vision.
[3] John Shawe-Taylor,et al. Canonical Correlation Analysis: An Overview with Application to Learning Methods , 2004, Neural Computation.
[4] Geoffrey Zweig,et al. From captions to visual concepts and back , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[5] Adam Kilgarriff,et al. SENSEVAL: an exercise in evaluating word sense disambiguation programs , 1998, LREC.
[6] Li Fei-Fei,et al. ImageNet: A large-scale hierarchical image database , 2009, CVPR.
[7] Krystian Mikolajczyk,et al. Deep correlation for matching images and text , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[8] Hinrich Schütze,et al. AutoExtend: Extending Word Embeddings to Embeddings for Synsets and Lexemes , 2015, ACL.
[9] Julie Weeds,et al. Finding Predominant Word Senses in Untagged Text , 2004, ACL.
[10] Peter Young,et al. Framing Image Description as a Ranking Task: Data, Models and Evaluation Metrics , 2013, J. Artif. Intell. Res.
[11] Chris Dyer,et al. Ontologically Grounded Multi-sense Representation Learning for Semantic Vector Space Models , 2015, NAACL.
[12] Samy Bengio,et al. Show and tell: A neural image caption generator , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[13] Trevor Darrell,et al. Unsupervised Learning of Visual Sense Models for Polysemous Words , 2008, NIPS.
[14] Luc Van Gool,et al. The Pascal Visual Object Classes Challenge: A Retrospective , 2014, International Journal of Computer Vision.
[15] Trevor Darrell,et al. Caffe: Convolutional Architecture for Fast Feature Embedding , 2014, ACM Multimedia.
[16] Andrew Zisserman,et al. Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.
[17] Jeffrey Pennington,et al. GloVe: Global Vectors for Word Representation , 2014, EMNLP.
[18] Dekang Lin,et al. Using Syntactic Dependency as Local Context to Resolve Word Sense Ambiguity , 1997, ACL.
[19] George A. Miller,et al. Introduction to WordNet: An On-line Lexical Database , 1990 .
[20] Michael E. Lesk,et al. Automatic sense disambiguation using machine readable dictionaries: how to tell a pine cone from an ice cream cone , 1986, SIGDOC '86.
[21] Roberto Navigli,et al. Word sense disambiguation: A survey , 2009, CSUR.
[22] Pietro Perona,et al. Describing Common Human Visual Actions in Images , 2015, BMVC.
[23] Fei-Fei Li,et al. Deep visual-semantic alignments for generating image descriptions , 2015, CVPR.
[24] Leonidas J. Guibas,et al. Human action recognition by learning bases of action attributes and parts , 2011, 2011 International Conference on Computer Vision.
[25] David A. Forsyth,et al. Discriminating Image Senses by Clustering with Multimodal Features , 2006, ACL.
[26] Mitchell P. Marcus,et al. OntoNotes: The 90% Solution , 2006, NAACL.
[27] Fei-Fei Li,et al. Deep visual-semantic alignments for generating image descriptions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[28] Mirella Lapata,et al. Good Neighbors Make Good Senses: Exploiting Distributional Similarity for Unsupervised WSD , 2008, COLING.
[29] Svetlana Lazebnik,et al. Improving Image-Sentence Embeddings Using Large Weakly Annotated Photo Collections , 2014, ECCV.
[30] Kobus Barnard,et al. Word Sense Disambiguation with Pictures , 2003, Artif. Intell.
[31] Jeff A. Bilmes,et al. Deep Canonical Correlation Analysis , 2013, ICML.
[32] Jeffrey Dean,et al. Efficient Estimation of Word Representations in Vector Space , 2013, ICLR.
[33] Jeff A. Bilmes,et al. On Deep Multi-View Representation Learning , 2015, ICML.
[34] Hwee Tou Ng,et al. It Makes Sense: A Wide-Coverage Word Sense Disambiguation System for Free Text , 2010, ACL.
[35] Fei-Fei Li,et al. Grouplet: A structured image representation for recognizing human and object interactions , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.
[36] Raffaella Bernardi,et al. TUHOI: Trento Universal Human Object Interaction Dataset , 2014, VL@COLING.
[37] Beth Levin,et al. English Verb Classes and Alternations: A Preliminary Investigation , 1993 .
[38] Xinlei Chen,et al. Sense discovery via co-clustering on images and text , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[39] Raffaella Bernardi,et al. Exploiting language models to recognize unseen actions , 2013, ICMR '13.