Clustering and semantically filtering web images to create a large-scale image ontology

To help close the "semantic gap" between images and their semantic descriptions, we are building a large-scale ontology of object images. This visual catalog will contain a large number of object images organized hierarchically, allowing image processing researchers to derive signatures for wide classes of objects. We are building the ontology from images found on the web. In this article we describe our initial approach to finding coherent sets of object images. We first perform two semantic filtering steps. The first decides which words correspond to objects and uses these words to query services that index the text associated with images (e.g. Google Image search), yielding a set of candidate images. The second uses face recognition technology to remove images of people from the candidate set, since we have found that queries for objects often return images of people. After these two steps, we have a cleaner set of candidate images for each object. We then index and cluster the remaining images using our system VIKA (VIsual KAtaloguer) to find coherent sets of objects.
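The last two stages of this pipeline (removing images of people, then grouping the remaining images) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `Candidate` record, the `contains_face` callback standing in for a face detector, the Euclidean distance on a made-up feature vector, and the greedy "leader" clustering are not taken from VIKA, whose indexing and clustering methods are not specified here. The word-selection and image-search steps are omitted because they depend on external services.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class Candidate:
    """A candidate image returned for an object word (hypothetical record)."""
    url: str
    features: Sequence[float]  # placeholder visual signature, not VIKA's


def filter_faces(
    candidates: List[Candidate],
    contains_face: Callable[[Candidate], bool],
) -> List[Candidate]:
    """Drop every candidate that the supplied detector flags as a person."""
    return [c for c in candidates if not contains_face(c)]


def distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two feature vectors of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def cluster(candidates: List[Candidate], threshold: float) -> List[List[Candidate]]:
    """Greedy leader clustering (a stand-in technique, chosen for brevity).

    Each image joins the first cluster whose first member (the leader) lies
    within `threshold`; otherwise it starts a new cluster.
    """
    clusters: List[List[Candidate]] = []
    for c in candidates:
        for group in clusters:
            if distance(c.features, group[0].features) <= threshold:
                group.append(c)
                break
        else:
            clusters.append([c])
    return clusters


def build_object_sets(
    candidates: List[Candidate],
    contains_face: Callable[[Candidate], bool],
    threshold: float = 0.5,
) -> List[List[Candidate]]:
    """Run the face filter, then cluster what remains."""
    return cluster(filter_faces(candidates, contains_face), threshold)
```

In practice `contains_face` would wrap a real face detector and `features` would come from an image indexing step; both are left abstract here so that the sketch stays self-contained.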
