Content-based image retrieval with ontological ranking

Images are a much more powerful medium of expression than text, as the adage says: "One picture is worth a thousand words." This is because, compared with text, which consists of a sequence of words, an image has more degrees of freedom and therefore a more complicated structure. However, this less constrained structure presents researchers in the computer vision community with the tough task of teaching machines to understand and organize images, especially when only a limited number of learning examples and limited background knowledge are given. The advance of internet and web technology in the past decade has changed the way humans gain knowledge. People can now exchange knowledge with others by discussing and contributing information on the web. As a result, the web pages on the internet have become a living and growing source of information. One is therefore tempted to wonder whether machines can learn from the web knowledge base as well. Indeed, it is possible to make computers learn from the internet and provide humans with more meaningful knowledge. In this work, we explore this novel possibility for image understanding, applied to semantic image search. We exploit web resources to obtain two things: links from images to keywords, and a semantic ontology that captures general human knowledge. The former maps visual content to related text, in contrast to the traditional way of associating images with surrounding text; the latter provides relations between concepts so that machines can understand to what extent and in what sense an image is close to the image search query. With the aid of these two tools, the resulting image search system is content-based and, moreover, organized. The returned images are ranked and organized such that semantically similar images are grouped together and given a rank based on their semantic closeness to the input query.
The novelty of the system is twofold: first, images are retrieved based not only on text cues but also on their actual content; second, the grouping differs from pure visual similarity clustering. More specifically, the inferred concepts of each image in a group are examined in the context of a large concept ontology to determine how they truly relate to what people have in mind when searching for images.
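
To make the idea of ranking by semantic closeness concrete, the following is a minimal sketch and not the system described above. It assumes a hypothetical toy is-a ontology (`PARENT`), hypothetical image file names, and one inferred concept per image; it measures closeness as the number of is-a edges between the query concept and the image concept, which is only one simple choice among many possible ontology-based measures.

```python
from collections import defaultdict

# Hypothetical toy is-a ontology (child -> parent); an assumption for illustration.
PARENT = {
    "animal": "entity",
    "vehicle": "entity",
    "dog": "animal",
    "cat": "animal",
    "puppy": "dog",
    "car": "vehicle",
}


def ancestors(concept):
    """Return the chain from the concept up to the root, concept first."""
    chain = [concept]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def path_distance(a, b):
    """Count is-a edges between a and b through their lowest common ancestor."""
    depth_from_a = {c: i for i, c in enumerate(ancestors(a))}
    for j, c in enumerate(ancestors(b)):
        if c in depth_from_a:
            return depth_from_a[c] + j
    raise ValueError(f"no common ancestor for {a!r} and {b!r}")


def rank_groups(query, image_concepts):
    """Group images by inferred concept, then order groups by distance to the query."""
    groups = defaultdict(list)
    for image, concept in image_concepts.items():
        groups[concept].append(image)
    ranked = sorted(
        groups.items(),
        key=lambda kv: (path_distance(query, kv[0]), kv[0]),
    )
    return [(concept, sorted(images)) for concept, images in ranked]


# Hypothetical inferred concepts for five images.
images = {
    "img1.jpg": "puppy",
    "img2.jpg": "car",
    "img3.jpg": "dog",
    "img4.jpg": "cat",
    "img5.jpg": "puppy",
}
result = rank_groups("dog", images)
for concept, members in result:
    print(concept, members)
```

For the query "dog", the groups come back in the order dog, puppy, cat, car: images sharing a concept stay together, and groups whose concept lies further from the query in the ontology are ranked lower.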
