Scalable Similarity Search With Topology Preserving Hashing

Hashing-based similarity search techniques are becoming increasingly popular for large data sets. To capture meaningful neighbors, a hashing method should exploit the topology of a data set, which comprises both the neighborhood relationships between its subregions and the relative proximities between the neighbors of each subregion, e.g., the relative neighborhood ranking of each subregion. However, most existing hashing methods are developed to preserve neighborhood relationships while ignoring the relative neighborhood proximities. Moreover, most hashing methods fail to provide a good result ranking, since many results often share the same Hamming distance to a query. In this paper, we propose a novel hashing method that addresses these two issues jointly. The proposed method is referred to as topology preserving hashing (TPH). TPH differs from prior work in that it preserves the neighborhood ranking in addition to the neighborhood relationships. Based on this framework, we present three TPH methods: linear unsupervised TPH, semisupervised TPH, and kernelized TPH. In particular, our unsupervised TPH is capable of mining semantic relationships between unlabeled data without supervised information. Extensive experiments on four large data sets demonstrate the superior performance of the proposed methods over several state-of-the-art unsupervised and semisupervised hashing techniques.
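The ranking issue mentioned in the abstract can be illustrated with a minimal sketch. It is not the TPH method: the codes below are uniformly random stand-ins for the output of some hash function, and the names `hamming` and `distance_histogram`, the code length of 32 bits, and the database size of 10,000 are assumptions chosen for illustration. Since a b-bit code admits only b + 1 distinct Hamming distances, a large database necessarily produces large groups of results that a plain Hamming ranking cannot order.

```python
import random
from collections import Counter


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two integer-encoded binary codes."""
    return bin(a ^ b).count("1")


def distance_histogram(query: int, codes: list) -> Counter:
    """Count how many database codes fall at each Hamming distance from the query."""
    return Counter(hamming(query, c) for c in codes)


# Illustrative setup (assumed values): random 32-bit codes stand in for hashed items.
rng = random.Random(0)
n_bits = 32
codes = [rng.getrandbits(n_bits) for _ in range(10_000)]
query = rng.getrandbits(n_bits)

hist = distance_histogram(query, codes)
largest_tie = max(hist.values())

# At most n_bits + 1 distinct distances exist, so ties are unavoidable.
print("distinct distances:", len(hist))
print("largest group sharing one distance:", largest_tie)
```

With 10,000 items and at most 33 possible distances, at least one distance value must be shared by more than 300 items, which is why a finer-grained ranking within each Hamming distance is useful.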
