Automatic Indexing and Content-Based Retrieval of Captioned Images

The interaction of textual and photographic information in an integrated text/image database environment is being explored. Specifically, our research group has developed an automatic indexing system for captioned pictures of people; the indexing information and other textual information are subsequently used in a content-based image retrieval system. Our approach presents an alternative to traditional face identification systems; it goes beyond a superficial combination of existing text-based and image-based approaches to information retrieval. By understanding the caption accompanying a picture, we can extract information that is useful both for retrieving the picture and for identifying the faces shown. In designing a pictorial database system, two major issues are (1) the amount and type of processing required when inserting new pictures into the database and (2) efficient retrieval schemes for query processing. Our research has focused on developing a computational model for understanding pictures based on accompanying descriptive text. Understanding a picture can be informally defined as the process of identifying relevant people and objects. Several current vision systems employ the idea of top-down control in picture understanding. We carry the notion of top-down control one step further, exploiting not only general context but also picture-specific context.