Visual Keyword-based Image Retrieval using Latent Semantic Indexing, Correlation-enhanced Similarity Matching and Query Expansion in Inverted Index

This paper presents an image retrieval framework that combines a scalable image representation with inverted file-based indexing, both built on automatically generated visual keywords. A codebook of visual keywords is constructed by applying self-organizing map (SOM)-based vector quantization to the feature space of segmented image regions. The codebook is then used to represent each image through keyword statistics computed both within the individual image and across the collection as a whole. To reduce the dimensionality of the resulting sparse feature vectors, latent semantic indexing is applied, and a similarity matching function is proposed that exploits the correlation between visual keywords. A query expansion strategy for the inverted index is also proposed, based on the topology-preserving structure of the SOM. Experimental results on a collection of 5000 general photographic images demonstrate the efficiency and effectiveness of the proposed approach compared with low-level histogram-based approaches.
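The pipeline described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no formulas, so the sketch assumes a tf-idf style weighting for the keyword statistics, a truncated SVD for latent semantic indexing, and a 4-neighbourhood on a rectangular SOM grid for query expansion. All function names and parameters (`keyword_histograms`, `tfidf`, `lsi_project`, `grid_neighbors`, `build_inverted_index`, `candidates`, `rows`, `cols`) are hypothetical, and the region-to-keyword assignments are assumed to come from an already trained SOM codebook. The correlation-enhanced similarity function is not sketched, since its form is not stated here.

```python
import numpy as np
from collections import defaultdict

def keyword_histograms(assignments, n_keywords):
    """Count visual keyword occurrences per image.

    assignments: one sequence of codebook indices per image (one index
    per segmented region), assumed to come from a trained SOM.
    """
    tf = np.zeros((len(assignments), n_keywords))
    for i, regions in enumerate(assignments):
        tf[i] = np.bincount(np.asarray(regions, dtype=int), minlength=n_keywords)
    return tf

def tfidf(tf):
    """Assumed weighting: within-image frequency times collection-level idf."""
    n_images = tf.shape[0]
    df = np.count_nonzero(tf, axis=0)              # images containing each keyword
    idf = np.log(n_images / np.maximum(df, 1))
    row_sums = np.maximum(tf.sum(axis=1, keepdims=True), 1)
    return (tf / row_sums) * idf

def lsi_project(weights, k):
    """Project image vectors onto the top-k right singular vectors."""
    _, _, vt = np.linalg.svd(weights, full_matrices=False)
    basis = vt[:k].T                               # shape (n_keywords, k)
    return weights @ basis, basis

def grid_neighbors(index, rows, cols):
    """4-neighbourhood of a node on a rows x cols SOM grid (row-major ids)."""
    r, c = divmod(index, cols)
    out = []
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        rr, cc = r + dr, c + dc
        if 0 <= rr < rows and 0 <= cc < cols:
            out.append(rr * cols + cc)
    return out

def build_inverted_index(tf):
    """Map each keyword id to the set of image ids that contain it."""
    index = defaultdict(set)
    for img, kw in zip(*np.nonzero(tf)):
        index[int(kw)].add(int(img))
    return index

def candidates(query_keywords, index, rows, cols):
    """Expand the query with grid neighbours, then union the posting lists."""
    expanded = set(query_keywords)
    for kw in query_keywords:
        expanded.update(grid_neighbors(kw, rows, cols))
    result = set()
    for kw in expanded:
        result |= index.get(kw, set())
    return result
```

In this sketch, only the candidate images returned by the inverted index would need to be compared with the query in the reduced space, which is where the efficiency gain over an exhaustive histogram comparison would come from.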

[1]  J. A. Hartigan,et al.  A k-means clustering algorithm , 1979 .

[2]  Marcel Worring,et al.  Content-Based Image Retrieval at the End of the Early Years , 2000, IEEE Trans. Pattern Anal. Mach. Intell..

[3]  Bo Zhang,et al.  An efficient and effective region-based image retrieval framework , 2004, IEEE Transactions on Image Processing.

[4]  Mikko Kurimo Indexing Audio Documents by using Latent Semantic Analysis and SOM , 1999 .

[5]  Shih-Fu Chang,et al.  Image Retrieval: Current Techniques, Promising Directions, and Open Issues , 1999, J. Vis. Commun. Image Represent..

[6]  Kai-Kuang Ma,et al.  Colour Image Indexing Using SOM for Region-of-Interest Retrieval , 1999, Pattern Analysis & Applications.

[7]  William I. Grosky,et al.  Narrowing the semantic gap - improved text-based web document retrieval using visual features , 2002, IEEE Trans. Multim..

[8]  Marco La Cascia,et al.  Unifying Textual and Visual Cues for Content-Based Image Retrieval on the World Wide Web , 1999, Comput. Vis. Image Underst..

[9]  Christos Faloutsos,et al.  QBIC project: querying images by content, using color, texture, and shape , 1993, Electronic Imaging.

[10]  Thierry Pun,et al.  Efficient access methods for content-based image retrieval with inverted files , 1999, Optics East.

[11]  Joo-Hwee Lim Explicit query formulation with visual keywords , 2000, ACM Multimedia.

[12]  Lei Zhu,et al.  Keyblock: an approach for content-based image retrieval , 2000, ACM Multimedia.

[13]  Ramin Zabih,et al.  Comparing images using color coherence vectors , 1997, MULTIMEDIA '96.

[14]  James Ze Wang,et al.  SIMPLIcity: Semantics-Sensitive Integrated Matching for Picture LIbraries , 2000, IEEE Trans. Pattern Anal. Mach. Intell..

[15]  Teuvo Kohonen,et al.  Self-Organizing Maps , 2010 .

[16]  T. Landauer,et al.  Indexing by Latent Semantic Analysis , 1990 .