Learning visual keywords for content-based retrieval

Keyword-based text retrieval systems have enjoyed reasonable success in real-world usage. Despite the simplicity of the keyword metaphor, practical text search engines are able to handle huge volumes of free-text documents. In image and video retrieval, by contrast, search relies mainly on pre-annotated keywords and/or primitive visual features. We propose the notion of visual keywords for content-based retrieval. The visual keywords of a given visual content domain are typical visual entities extracted through statistical learning. Visual content is described spatially in terms of the extracted visual keywords and coded via singular value decomposition (SVD) for similarity matching. We demonstrate our framework on the retrieval of natural scene images.
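The SVD-based coding described above resembles latent semantic indexing applied to spatial visual-keyword histograms. The following is a minimal sketch of that idea, not the paper's actual implementation: the image descriptors, grid layout, and dimensions are illustrative assumptions, with each image represented by a flattened grid of visual-keyword counts, projected into a low-rank subspace, and compared by cosine similarity.

```python
import numpy as np

# Illustrative data (not from the paper): 8 images, each described by counts
# of 5 visual keywords over a 2x2 spatial grid, flattened to a 20-dim vector.
rng = np.random.default_rng(0)
images = rng.poisson(2.0, size=(8, 20)).astype(float)

# SVD of the image-by-descriptor matrix; keep the top-k right singular
# vectors as a compact coding basis, analogous to latent semantic indexing.
k = 3
U, s, Vt = np.linalg.svd(images, full_matrices=False)
codes = images @ Vt[:k].T  # project each image into the k-dim subspace

def cosine(a, b):
    """Cosine similarity between two coded image vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Rank database images by similarity to a query image's code.
query = codes[0]
ranking = sorted(range(len(codes)), key=lambda i: -cosine(query, codes[i]))
```

In this sketch the query image ranks itself first, and the low-rank projection makes matching robust to small variations in individual keyword counts while keeping the comparison cheap.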
