Resolving vision and language ambiguities together: Joint segmentation & prepositional attachment resolution in captioned scenes
Gordon Christie | Yash Goyal | Dhruv Batra | Kevin Kochersberger | Aishwarya Agrawal | Ankit Laddha | Stanislaw Antol
[1] Stefanie Jegelka,et al. Submodular meets Structured: Finding Diverse Subsets in Exponentially-Large Structured Item Sets, 2014, NIPS.
[2] Daniel Tarlow,et al. Optimizing Expected Intersection-Over-Union with Candidate-Constrained CRFs , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[3] Sebastian Nowozin,et al. A Comparative Study of Modern Inference Techniques for Discrete Energy Minimization Problems , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.
[4] C. Lawrence Zitnick,et al. Bringing Semantics into Focus Using Visual Abstraction , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.
[5] Christopher D. Manning,et al. Generating Typed Dependency Parses from Phrase Structure Parses , 2006, LREC.
[6] Samy Bengio,et al. Show and tell: A neural image caption generator , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[7] Mario Fritz,et al. Ask Your Neurons: A Neural-Based Approach to Answering Questions about Images , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[8] Jeffrey Dean,et al. Efficient Estimation of Word Representations in Vector Space , 2013, ICLR.
[9] Daniel Tarlow,et al. Empirical Minimum Bayes Risk Prediction: How to Extract an Extra Few % Performance from Vision Models with Just Three More Parameters , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[10] Sanja Fidler,et al. A Sentence Is Worth a Thousand Pixels , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.
[11] Yair Weiss,et al. Globally optimal solutions for energy minimization in stereo vision using reweighted belief propagation , 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.
[12] Ashutosh Saxena,et al. Cascaded Classification Models: Combining Models for Holistic Scene Understanding , 2008, NIPS.
[13] Yang Wang,et al. Image Retrieval with Structured Object Queries Using Latent Ranking SVM , 2012, ECCV.
[14] P. R. Hawley. See no evil. , 1953, Bulletin of the American College of Surgeons.
[15] Pushmeet Kohli,et al. DivMCuts: Faster Training of Structural SVMs with Diverse M-Best Cutting-Planes , 2013, AISTATS.
[16] C. Lawrence Zitnick,et al. CIDEr: Consensus-based image description evaluation , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[17] Frank Keller,et al. Unsupervised Visual Sense Disambiguation for Verbs using Multimodal Embeddings , 2016, NAACL.
[18] Luke S. Zettlemoyer,et al. See No Evil, Say No Evil: Description Generation from Densely Labeled Images , 2014, *SEMEVAL.
[19] Donald Geman,et al. Visual Turing test for computer vision systems , 2015, Proceedings of the National Academy of Sciences.
[20] Ron Artstein,et al. Annotating (Anaphoric) Ambiguity, 2005.
[21] Licheng Yu,et al. Visual Madlibs: Fill in the blank Image Generation and Question Answering , 2015, ArXiv.
[22] Licheng Yu,et al. Visual Madlibs: Fill in the Blank Description Generation and Question Answering , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[23] Sanja Fidler,et al. What Are You Talking About? Text-to-Image Coreference , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[24] Dhruv Batra,et al. Active learning for structured probabilistic models with histogram approximation , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[25] Shimon Ullman,et al. Do You See What I Mean? Visual Resolution of Linguistic Ambiguities , 2015, EMNLP.
[26] Richard Szeliski,et al. A Comparative Study of Energy Minimization Methods for Markov Random Fields with Smoothness-Based Priors , 2008, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[27] Gregory Shakhnarovich,et al. Discriminative Re-ranking of Diverse Segmentations , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.
[28] Cyrus Rashtchian,et al. Collecting Image Annotations Using Amazon’s Mechanical Turk , 2010, Mturk@HLT-NAACL.
[29] Sanja Fidler,et al. The Role of Context for Object Detection and Semantic Segmentation in the Wild , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[30] Gregory Shakhnarovich,et al. Diverse M-Best Solutions in Markov Random Fields , 2012, ECCV.
[31] Mario Fritz,et al. A Pooling Approach to Modelling Spatial Relations for Image Retrieval and Annotation , 2014, ArXiv.
[32] Geoffrey Zweig,et al. From captions to visual concepts and back , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[33] David Chiang,et al. Better k-best Parsing , 2005, IWPT.
[34] Luc Van Gool,et al. The Pascal Visual Object Classes (VOC) Challenge , 2010, International Journal of Computer Vision.
[35] Dhruv Batra,et al. An Efficient Message-Passing Algorithm for the M-Best MAP Problem , 2012, UAI.
[36] Jiasen Lu,et al. VQA: Visual Question Answering , 2015, ICCV.
[37] Kobus Barnard,et al. Word sense disambiguation with pictures , 2003, HLT-NAACL 2003.
[38] Iasonas Kokkinos,et al. Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs , 2014, ICLR.
[39] Adwait Ratnaparkhi,et al. A Maximum Entropy Model for Prepositional Phrase Attachment , 1994, HLT.
[40] Gregory Shakhnarovich,et al. A Systematic Exploration of Diversity in Machine Translation , 2013, EMNLP.