Text and content based image retrieval via locality sensitive hashing

We present a scalable image retrieval system based jointly on text annotations and visual content. Previous approaches to content-based image retrieval often suffer from the semantic gap problem and long retrieval times. Our solution addresses both issues by indexing and retrieving images using both their text descriptions and their visual content, such as colour, texture and shape features. A query in this system consists of keywords, a sample image and relevant parameters. The retrieval algorithm first selects a subset of images from the whole collection by comparing the keywords with the text descriptions. Visual features extracted from the sample image are then compared with the features extracted from the images in the subset to select the closest matches. Because the features are represented by high-dimensional vectors, locality-sensitive hashing is applied to the visual comparison to speed up the process. Experiments were performed on a collection of 1514 images. The timing results suggest that this solution has the potential to scale up to large image collections.
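The two-stage query described above (keyword filter first, then hashed visual comparison) can be illustrated with a minimal sketch. It is not the system's actual implementation: the abstract does not say which hash family is used, so random-hyperplane hashing is swapped in here for brevity, and the names `HyperplaneLSH`, `retrieve`, `n_bits`, `n_tables` and the toy three-dimensional feature vectors are all hypothetical.

```python
import math
import random
from collections import defaultdict


class HyperplaneLSH:
    """Random-hyperplane LSH index (an assumed hash family, not the paper's)."""

    def __init__(self, dim, n_bits=8, n_tables=4, seed=0):
        rng = random.Random(seed)
        # One independent set of n_bits random hyperplanes per hash table.
        self.planes = [
            [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]
            for _ in range(n_tables)
        ]
        self.tables = [defaultdict(list) for _ in range(n_tables)]

    def _key(self, planes, vec):
        # Each bit records which side of one hyperplane the vector falls on.
        return tuple(
            sum(p * v for p, v in zip(plane, vec)) >= 0.0 for plane in planes
        )

    def add(self, image_id, vec):
        for planes, table in zip(self.planes, self.tables):
            table[self._key(planes, vec)].append(image_id)

    def candidates(self, vec):
        # Union of the buckets the query falls into, one bucket per table.
        found = set()
        for planes, table in zip(self.planes, self.tables):
            found.update(table.get(self._key(planes, vec), []))
        return found


def retrieve(keywords, query_vec, collection, index, top_k=3):
    """collection maps image_id -> (set of lowercase annotation words, vector)."""
    wanted = {w.lower() for w in keywords}
    # Stage 1: keep images whose annotations share at least one keyword.
    text_hits = {i for i, (words, _) in collection.items() if wanted & words}
    # Stage 2: restrict to LSH candidates, then rank by exact Euclidean distance.
    pool = text_hits & index.candidates(query_vec)
    ranked = sorted(pool, key=lambda i: math.dist(query_vec, collection[i][1]))
    return ranked[:top_k]


# Toy collection with made-up annotations and feature vectors.
collection = {
    "a": ({"beach", "sea"}, [1.0, 0.9, 0.1]),
    "b": ({"beach", "sunset"}, [-1.0, -0.8, 0.2]),
    "c": ({"city"}, [1.0, 0.9, 0.1]),
}
index = HyperplaneLSH(dim=3)
for image_id, (_, vec) in collection.items():
    index.add(image_id, vec)

print(retrieve(["beach"], [1.0, 0.9, 0.1], collection, index))
```

In this sketch image "c" is visually identical to the query but is excluded by the keyword stage, while the hashing stage limits exact distance computations to images that share a bucket with the query in at least one table. A real index would hash the full collection once and reuse it across queries.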
