Vocabulary-based hashing for image search

This paper proposes a family of hash functions based on feature vocabularies and investigates its application to building indexes for image search. Each hash function is associated with a set of feature points, i.e. a vocabulary, and maps an input point to the ID of its nearest neighbor in that vocabulary. The family can be used to build a high-dimensional index for approximate nearest neighbor search. We then concentrate on its application to image search. We derive guiding rules for constructing the vocabularies, which improve the effectiveness of the approach in this context by taking advantage of the data distribution, and we apply these rules to design a practical algorithm for vocabulary construction. Experiments show promising performance of the approach and the effectiveness of the guiding rules. A comparison with the popular Euclidean locality-sensitive hashing also shows the advantage of our approach in image search.
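
To make the hashing scheme concrete, the following is a minimal sketch of a vocabulary-based hash function and of an index built from several such functions. It is an illustration under stated assumptions and not the implementation evaluated in the paper: the names `vocab_hash`, `build_index`, and `query` are hypothetical, distances are assumed to be Euclidean, and the vocabularies are taken as given, so the guiding rules for vocabulary construction are not reproduced here.

```python
from collections import defaultdict


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def vocab_hash(point, vocabulary):
    """Hash a point to the ID (list index) of its nearest vocabulary word."""
    return min(range(len(vocabulary)),
               key=lambda i: sq_dist(point, vocabulary[i]))


def build_index(points, vocabularies):
    """Build one hash table per vocabulary: word ID -> list of point indices.

    Using several vocabularies (assumed here, as in typical multi-table
    hashing schemes) raises the chance that true neighbors share a bucket.
    """
    tables = [defaultdict(list) for _ in vocabularies]
    for idx, p in enumerate(points):
        for table, vocab in zip(tables, vocabularies):
            table[vocab_hash(p, vocab)].append(idx)
    return tables


def query(q, points, vocabularies, tables):
    """Approximate nearest neighbor: rank only points colliding with q."""
    candidates = set()
    for table, vocab in zip(tables, vocabularies):
        # .get avoids creating an empty bucket for unseen word IDs.
        candidates.update(table.get(vocab_hash(q, vocab), []))
    if not candidates:
        return None  # no collision in any table
    return min(candidates, key=lambda i: sq_dist(q, points[i]))
```

Given a list of feature vectors `points` and a list of vocabularies `vocabs`, `tables = build_index(points, vocabs)` builds the index once, and `query(q, points, vocabs, tables)` returns the index of the closest point among those sharing at least one bucket with `q`. The linear scan inside `vocab_hash` keeps the sketch short; a large vocabulary would call for a faster nearest-word lookup.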