WARank: Weighted Asymmetric Ranking for Approximate Nearest Neighbor Search

Binary hashing methods are widely used for large-scale approximate nearest neighbor search because of two benefits: low memory usage and high search efficiency. In these methods, binary codes are usually ranked by Hamming distance or asymmetric distance. Asymmetric distance is generally more accurate than Hamming distance, so recent work focuses on asymmetric distance ranking. Existing asymmetric distance ranking methods approximate the query-independent values by sample averages. However, when the data distribution is not uniform, sample averages are not representative, which leads to incorrect ranking results. To address this problem, we propose the Weighted Asymmetric Distance Ranking (WARank) algorithm, which consists of two parts. First, we present an Otsu threshold-based method for obtaining more appropriate query-independent values: the Otsu threshold performs almost the same as the average value when the data distribution is uniform, but much better when it is not. Second, we present a bit-level weight calculation method that assigns different weights to different bits in order to minimize the negative effect of any bit whose distribution is not uniform. Experiments on public datasets show that the proposed WARank algorithm achieves higher search accuracy than state-of-the-art methods.
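The two ingredients named above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract gives neither the exact way the Otsu threshold is turned into query-independent values nor the bit-weight formula, so the function names (`otsu_threshold`, `weighted_asymmetric_distance`), the per-bit representative pairs `reps`, and the `weights` argument are all assumptions made for illustration. The first function computes an exact one-dimensional Otsu split (the cut that maximizes between-class variance), and the second shows one plausible form of a weighted asymmetric distance between an unbinarized query projection and a binary code.

```python
def otsu_threshold(values):
    """Exact 1-D Otsu split: the cut point maximizing between-class variance."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    best_score, best_cut = -1.0, xs[0]
    prefix = 0.0
    for i in range(1, n):
        prefix += xs[i - 1]          # sum of the i smallest values
        if xs[i] == xs[i - 1]:
            continue                 # no cut fits between equal values
        w0 = i / n                   # fraction of samples below the cut
        w1 = 1.0 - w0
        mu0 = prefix / i             # mean of the lower class
        mu1 = (total - prefix) / (n - i)  # mean of the upper class
        score = w0 * w1 * (mu0 - mu1) ** 2
        if score > best_score:
            best_score = score
            best_cut = (xs[i - 1] + xs[i]) / 2.0
    return best_cut


def weighted_asymmetric_distance(query_proj, code, reps, weights):
    """Assumed form: sum_k w_k * (q_k - reps[k][b_k])**2.

    query_proj: real-valued query projections, one per bit (not binarized)
    code:       database binary code, a sequence of 0/1 bits
    reps:       hypothetical per-bit pairs (value for bit 0, value for bit 1)
    weights:    hypothetical per-bit weights
    """
    return sum(
        w * (q - r[b]) ** 2
        for q, b, r, w in zip(query_proj, code, reps, weights)
    )


# Evenly spread data: the Otsu cut and the mean nearly coincide.
uniform = [i / 1000 for i in range(1001)]
# Skewed data: 900 values in [0, 1] and 100 values in [9, 10].
skewed = [i / 899 for i in range(900)] + [9 + i / 99 for i in range(100)]

for name, data in (("uniform", uniform), ("skewed", skewed)):
    mean = sum(data) / len(data)
    print(name, "mean:", round(mean, 3), "otsu:", round(otsu_threshold(data), 3))
```

On the evenly spread sample the cut and the mean both sit near 0.5, whereas on the skewed sample the mean is pulled to about 1.4 while the Otsu cut lands in the gap between the two clusters. This mirrors the behavior the abstract describes, but it says nothing about the accuracy figures reported in the paper.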
