Every Picture Tells a Story: Generating Sentences from Images
Cyrus Rashtchian | Ali Farhadi | David A. Forsyth | Peter Young | Mohammad Amin Sadeghi | Julia Hockenmaier | Mohsen Hejrati
[1] F. Quimby. What's in a picture? , 1993, Laboratory animal science.
[2] Dekang Lin,et al. An Information-Theoretic Definition of Similarity , 1998, ICML.
[3] Y. Mori,et al. Image-to-word transformation based on dividing and vector quantizing images with words , 1999 .
[4] Richard Sproat,et al. WordsEye: an automatic text-to-scene conversion system , 2001, SIGGRAPH.
[5] David A. Forsyth,et al. Clustering art , 2001, Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. CVPR 2001.
[6] Mads Nielsen,et al. Computer Vision — ECCV 2002 , 2002, Lecture Notes in Computer Science.
[7] David A. Forsyth,et al. Object Recognition as Machine Translation: Learning a Lexicon for a Fixed Image Vocabulary , 2002, ECCV.
[8] P. Jonathon Phillips,et al. Meta-analysis of face recognition algorithms , 2001, Proceedings of Fifth IEEE International Conference on Automatic Face Gesture Recognition.
[9] Antonio Torralba,et al. Modeling the Shape of the Scene: A Holistic Representation of the Spatial Envelope , 2001, International Journal of Computer Vision.
[10] James Ze Wang,et al. Content-based image retrieval: approaches and trends of the new age , 2005, MIR '05.
[11] Ben Taskar,et al. Learning structured prediction models: a large margin approach , 2005, ICML.
[12] Nathan D. Ratliff,et al. Subgradient Methods for Maximum Margin Structured Learning , 2006 .
[13] Antonio Torralba,et al. Building the gist of a scene: the role of global image features in recognition. , 2006, Progress in brain research.
[14] Johan Bos,et al. Linguistically Motivated Large-Scale NLP with C&C and Boxer , 2007, ACL.
[15] Fei-Fei Li,et al. What, where and who? Classifying events by scene and object recognition , 2007, 2007 IEEE 11th International Conference on Computer Vision.
[16] Larry S. Davis,et al. Objects in Action: An Approach for Combining Action Understanding and Object Perception , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.
[17] Andrew J. Davison,et al. Active Matching , 2008, ECCV.
[18] Cordelia Schmid,et al. Learning realistic human actions from movies , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.
[19] Thomas Mensink,et al. Improving People Search Using Query Expansions , 2008, ECCV.
[20] David A. McAllester,et al. A discriminatively trained, multiscale, deformable part model , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.
[21] Larry S. Davis,et al. Beyond Nouns: Exploiting Prepositions and Comparative Adjectives for Learning Visual Classifiers , 2008, ECCV.
[22] Derek Hoiem,et al. Pascal VOC 2008 Challenge , 2008 .
[23] Barbara Caputo,et al. Who's Doing What: Joint Modeling of Names and Verbs for Simultaneous Face and Pose Annotation , 2009, NIPS.
[24] Larry S. Davis,et al. Observing Human-Object Interactions: Using Spatial and Functional Compatibility for Recognition , 2009, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[25] Larry S. Davis,et al. Understanding videos, constructing plots learning a visually grounded storyline model from annotated videos , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.
[26] Li Fei-Fei,et al. Towards total scene understanding: Classification, annotation and segmentation in an automatic framework , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.
[27] Cecilia Ovesdotter Alm,et al. Object Categorization: Words and Pictures: Categories, Modifiers, Depiction, and Iconography , 2009 .
[28] Liang Lin,et al. I2T: Image Parsing to Text Description , 2010, Proceedings of the IEEE.
[29] Fei-Fei Li,et al. Modeling mutual context of object and human pose in human-object interaction activities , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.
[30] Cyrus Rashtchian,et al. Collecting Image Annotations Using Amazon’s Mechanical Turk , 2010, Mturk@HLT-NAACL.