SIFT features are widely used in content-based image retrieval. Typically, a few thousand keypoints are extracted from each image. Image matching involves distance computations across all pairs of SIFT feature vectors from both images, which is quite costly. We show that SIFT features perform surprisingly well even after each component is quantized to a single bit, when the component medians are used as the quantization thresholds. Quantized features preserve both distinctiveness and matching properties. Almost all of the features in our 5.4 million feature test set map to distinct binary patterns after quantization. Furthermore, the numbers of matches between images using the original and the binary quantized SIFT features are quite similar. We investigate the distribution of SIFT features and observe that the space of 128-D binary vectors has sufficient capacity for the current performance of SIFT features. We use component median values as quantization thresholds and show through vector-to-vector distance comparisons and image-to-image matches that the resulting binary vectors perform comparably to the original SIFT vectors. We also discuss computational and storage gains. Binary vector distance computation reduces to bit-wise operations, and the squaring operation is eliminated. Fast and efficient indexing techniques, such as the signatures used for chemical databases, can also be considered.
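As an illustration of the quantization and the bit-wise distance described above, here is a minimal sketch in Python with NumPy. It is not the authors' implementation. The function names (`median_thresholds`, `binarize`, `hamming_distances`), the strict `>` comparison against the median, and the packing of 128 bits into 16 bytes are all assumptions made for this example.

```python
import numpy as np

def median_thresholds(descriptors):
    """Per-component medians over a set of descriptors of shape (n, 128)."""
    return np.median(descriptors, axis=0)

def binarize(descriptors, thresholds):
    """Quantize each component to one bit, then pack 128 bits into 16 bytes.

    Assumption: a component maps to 1 when it is strictly above its median.
    """
    bits = descriptors > thresholds
    return np.packbits(bits, axis=1)

# Lookup table: number of set bits for every possible byte value.
POPCOUNT = np.array([bin(i).count("1") for i in range(256)], dtype=np.uint16)

def hamming_distances(a, b):
    """Pairwise Hamming distances between packed codes a (m, 16) and b (n, 16).

    Only XOR and a table lookup are needed; no subtraction or squaring.
    """
    xor = a[:, None, :] ^ b[None, :, :]
    return POPCOUNT[xor].sum(axis=2)
```

Under these assumptions, each descriptor occupies 16 bytes instead of 128 (at one byte per component), and a match between two images could be scored by thresholding the entries of the distance matrix returned by `hamming_distances`.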