Paper information - Automatic image captioning

We examine the problem of automatic image captioning. Given a training set of captioned images, we want to discover correlations between image features and keywords, so that we can automatically find good keywords for a new image. We experiment thoroughly with multiple design alternatives on large datasets of various content styles, and our proposed methods achieve up to a 45% relative improvement in captioning accuracy over the state of the art.
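To make the idea of correlating image features with keywords concrete, here is a minimal sketch in Python. It is not the method proposed in the paper. It assumes each image has already been reduced to a handful of discrete feature tokens (the token names, the toy training set, and the functions `train_cooccurrence` and `suggest_keywords` are all hypothetical), counts how often each token co-occurs with each caption keyword, and ranks keywords for a new image by summed conditional frequency.

```python
from collections import Counter, defaultdict


def train_cooccurrence(captioned_images):
    """Count, per feature token, how many training images pair it with each keyword."""
    counts = defaultdict(Counter)
    for tokens, keywords in captioned_images:
        for token in set(tokens):
            for word in set(keywords):
                counts[token][word] += 1
    return counts


def suggest_keywords(counts, tokens, top_k=2):
    """Rank keywords for a new image by summing the estimate of P(word | token)."""
    scores = Counter()
    for token in set(tokens):
        word_counts = counts.get(token)
        if not word_counts:
            continue  # token never seen in training: contributes nothing
        total = sum(word_counts.values())
        for word, n in word_counts.items():
            scores[word] += n / total
    # Sort by descending score; break ties alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:top_k]]


# Toy training set (hypothetical): (feature tokens, caption keywords) per image.
TRAINING = [
    (["blue_top", "green_bottom"], ["sky", "grass"]),
    (["blue_top", "yellow_bottom"], ["sky", "sand"]),
    (["orange_blob", "green_bottom"], ["tiger", "grass"]),
]
MODEL = train_cooccurrence(TRAINING)

if __name__ == "__main__":
    print(suggest_keywords(MODEL, ["blue_top", "green_bottom"]))
```

A plain co-occurrence count like this is only a baseline. It ignores how informative each token is and how rare each keyword is, which is the kind of design alternative the abstract says the authors compare.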
