Attention Model Based SIFT Keypoints Filtration for Image Retrieval

Effective feature extraction is a fundamental component of content-based image retrieval. The Scale Invariant Feature Transform (SIFT) has been shown to be the most robust local invariant feature descriptor. However, the SIFT algorithm generates hundreds of thousands of keypoints per image, and most of them come from the background. This seriously hampers the use of SIFT in real-time image retrieval. This paper addresses the problem by proposing a novel method that filters SIFT keypoints using an attention model. Based on visual attention analysis, all keypoints in an image are ranked by their attention saliency, and only the most distinctive keypoints are retained. A bag-of-words model is then used to index these features efficiently. Experiments demonstrate that the attention-model-based SIFT keypoint filtration algorithm provides significant benefits in both retrieval accuracy and matching speed.
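
The ranking-and-filtering step described above can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not specify the attention model or the cutoff, so the sketch assumes a precomputed saliency map (one value per pixel), assumes each keypoint is scored by the saliency at its nearest pixel, and uses a hypothetical `keep_ratio` parameter to decide how many keypoints survive. The function name `filter_keypoints_by_saliency` is likewise an illustrative choice.

```python
from typing import List, Sequence, Tuple

Keypoint = Tuple[float, float]  # (x, y) position in pixel coordinates


def filter_keypoints_by_saliency(
    keypoints: Sequence[Keypoint],
    saliency_map: Sequence[Sequence[float]],
    keep_ratio: float = 0.3,
) -> List[Keypoint]:
    """Rank keypoints by saliency and keep the top fraction.

    Assumptions (not taken from the paper): saliency_map[row][col] holds a
    per-pixel attention value, and keep_ratio is a hypothetical cutoff.
    """
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")
    if not keypoints:
        return []

    height = len(saliency_map)
    width = len(saliency_map[0])

    def saliency_at(kp: Keypoint) -> float:
        # Score a keypoint by the saliency of its nearest pixel,
        # clamping coordinates to the image bounds.
        x, y = kp
        col = min(max(int(round(x)), 0), width - 1)
        row = min(max(int(round(y)), 0), height - 1)
        return saliency_map[row][col]

    # Most salient keypoints first; ties keep their original order.
    ranked = sorted(keypoints, key=saliency_at, reverse=True)
    n_keep = max(1, int(len(ranked) * keep_ratio))
    return ranked[:n_keep]


# Toy example: a 4x4 saliency map with a salient centre region.
saliency = [
    [0.1, 0.1, 0.1, 0.1],
    [0.1, 0.9, 0.8, 0.1],
    [0.1, 0.7, 0.6, 0.1],
    [0.1, 0.1, 0.1, 0.1],
]
kps = [(0, 0), (1, 1), (2, 1), (3, 3), (1, 2), (2, 2)]
kept = filter_keypoints_by_saliency(kps, saliency, keep_ratio=0.5)
print(kept)
```

In a full pipeline the surviving keypoints' descriptors, rather than their coordinates alone, would then be quantized into visual words for bag-of-words indexing. Discarding low-saliency keypoints before that step is what reduces the number of features to index and match.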
