Paper Information - Data-Dependent Hashing Based on p-Stable Distribution


The p-stable distribution is traditionally used for data-independent hashing. In this paper, we describe how to perform data-dependent hashing based on the p-stable distribution. We begin by formulating the Euclidean distance preserving property in terms of variance estimation. Based on this property, we develop a projection method that maps the original data to vectors of arbitrary dimension. Each projection vector is a linear combination of multiple random vectors drawn from a p-stable distribution, and the weights of the linear combination are learned from the training data. An orthogonal matrix is then learned from the data to minimize the thresholding error in quantization. Combining the projection method and the orthogonal matrix, we develop an unsupervised hashing scheme that preserves the Euclidean distance. Compared with data-independent hashing methods, our method takes the data distribution into account and gives more accurate hashing results with compact hash codes. Unlike many data-dependent hashing methods, our method accommodates multiple hash tables and is not restricted by the number of hash functions. To extend our method to a supervised scenario, we incorporate a supervised label propagation scheme into the proposed projection method. This results in a supervised hashing scheme that preserves the semantic similarity of the data. Experimental results show that our methods outperform several state-of-the-art hashing approaches in both effectiveness and efficiency.
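The two ingredients named in the abstract can be illustrated with a minimal sketch, under stated assumptions. It covers only the Gaussian case (p = 2): a projection of x - y onto a random vector with i.i.d. N(0, 1) entries is distributed as N(0, ||x - y||^2), so the second moment of many such projections estimates the squared Euclidean distance. For the orthogonal matrix, the abstract does not say how it is learned; the sketch swaps in alternating orthogonal Procrustes updates in the style of iterative quantization. The function names `estimate_distance` and `learn_rotation` and all parameter values are hypothetical, and the learned linear-combination weights of the paper are not reproduced here.

```python
import numpy as np


def estimate_distance(x, y, num_projections=4096, seed=0):
    """Estimate ||x - y||_2 from random Gaussian (2-stable) projections."""
    rng = np.random.default_rng(seed)
    # Each row is a random vector with i.i.d. N(0, 1) entries.
    W = rng.standard_normal((num_projections, x.shape[0]))
    # By 2-stability, every entry of W @ (x - y) is N(0, ||x - y||^2).
    diffs = W @ (x - y)
    # The mean of the squared projections estimates the squared distance.
    return float(np.sqrt(np.mean(diffs ** 2)))


def learn_rotation(V, num_iters=50, seed=0):
    """Assumed stand-in: find an orthogonal R that reduces ||sign(VR) - VR||_F."""
    rng = np.random.default_rng(seed)
    k = V.shape[1]
    # Start from a random orthogonal matrix (Q factor of a Gaussian matrix).
    R, _ = np.linalg.qr(rng.standard_normal((k, k)))
    for _ in range(num_iters):
        # Fix R, update the binary codes by thresholding at zero.
        B = np.where(V @ R >= 0, 1.0, -1.0)
        # Fix B, update R: orthogonal Procrustes solution from the SVD of V^T B.
        U, _, Vt = np.linalg.svd(V.T @ B)
        R = U @ Vt
    return R


# Small demonstration on synthetic data (hypothetical sizes).
demo_rng = np.random.default_rng(1)
x, y = demo_rng.standard_normal(64), demo_rng.standard_normal(64)
print("estimated:", estimate_distance(x, y), "exact:", np.linalg.norm(x - y))

V = demo_rng.standard_normal((500, 16))   # stands in for projected data
R = learn_rotation(V)
codes = V @ R >= 0                        # 16-bit binary codes, one row per point
print("code matrix shape:", codes.shape)
```

Increasing `num_projections` tightens the distance estimate, since it is a sample second moment. The rotation step leaves pairwise Euclidean distances of the projected data unchanged because R is orthogonal; it only changes how the data align with the thresholding axes.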

Jun Zhou | Peng Ren | Jian Cheng | Xiao Bai | Haichuan Yang
