Combining text and image information in content-based retrieval

This research explores the interaction of textual and photographic information in an integrated text/image database environment. By understanding the caption accompanying a picture, we can extract information useful in (i) retrieving the picture and (ii) directing an image interpretation system to identify relevant objects (in this case, faces) in the picture. The latter constitutes a powerful technique for automatically indexing images. In cases where images are not accompanied by text, it is far easier to manually add a line of descriptive text than to manually ground-truth the images. A multi-stage system, PICTION, which uses captions to identify human faces in an accompanying photograph, has been developed. We discuss the use of PICTION's output in content-based retrieval of images to satisfy queries that specify a focus of attention.
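
As a rough illustration of how caption-derived information might drive both indexing and retrieval, the sketch below pairs names taken from a caption with detected face regions using a left-to-right ordering cue, then answers a query by name and returns the matching face region. This is not PICTION's actual algorithm: the `Face` record, the "From left:" caption convention, and all function names are assumptions made for illustration, and face detection itself is taken as given.

```python
from dataclasses import dataclass


@dataclass
class Face:
    """Hypothetical face bounding box, in pixel coordinates."""
    x: int  # left edge
    y: int  # top edge
    w: int
    h: int


def names_from_caption(caption: str) -> list[str]:
    """Toy caption parser (assumed convention): 'From left: A, B and C.'"""
    marker = "from left:"
    start = caption.lower().find(marker)
    if start < 0:
        return []
    # Keep only the text up to the first full stop after the marker.
    listing = caption[start + len(marker):].split(".")[0]
    parts = listing.replace(" and ", ",").split(",")
    return [p.strip() for p in parts if p.strip()]


def label_faces(caption: str, faces: list[Face]) -> dict[str, Face]:
    """Assign caption names to faces in left-to-right order."""
    names = names_from_caption(caption)
    ordered = sorted(faces, key=lambda f: f.x)
    # Refuse to guess when the caption and the image disagree on head count.
    if len(names) != len(ordered):
        return {}
    return dict(zip(names, ordered))


def build_index(collection: dict[str, tuple[str, list[Face]]]) -> dict[str, list[tuple[str, Face]]]:
    """Map each lower-cased name to the (image id, face region) pairs showing it."""
    index: dict[str, list[tuple[str, Face]]] = {}
    for image_id, (caption, faces) in collection.items():
        for name, face in label_faces(caption, faces).items():
            index.setdefault(name.lower(), []).append((image_id, face))
    return index


def find_person(index: dict[str, list[tuple[str, Face]]], name: str) -> list[tuple[str, Face]]:
    """Return images containing the named person, with the region to focus on."""
    return index.get(name.lower(), [])
```

Returning the face region along with the image identifier is one simple way a retrieval front end could direct attention to the relevant part of a picture rather than to the picture as a whole.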