Analysis of User Image Descriptions and Automatic Image Indexing Vocabularies: An Exploratory Study

This study explores the terms that users assign to index, manage, and describe images and compares them with the indexing terms that image retrieval systems derive automatically. The results indicate that user-derived indexing vocabulary largely reflects what users see in the image or what they perceive as its overall topic. This contrasts with system-derived indexing, in which terms are extracted from existing text surrounding the image. In many cases, the surrounding text does not describe the image; rather, the image is used to illustrate or expand upon the text. System-derived vocabulary may describe higher-level concepts, for example, industrial pollution rather than smoke. The paper concludes with suggestions for using natural language processing techniques to provide vocabulary alignment in image retrieval.