Learning reconfigurable hashing for diverse semantics

In recent years, locality-sensitive hashing (LSH) has gained plenty of attention from both the multimedia and computer vision communities due to its empirical success and theoretic guarantee in large-scale visual indexing and retrieval. Conventional LSH algorithms are designated either for generic metrics such as Cosine similarity, ℓ2-norm and Jaccard index, or for the metrics learned from user-supplied supervision information. The common drawbacks of existing algorithms are their incapability to be adapted to metric changes, along with the inefficacy when handling diverse semantics (e. g., more than 1K different categories in the well-known ImageNet database). For the metrics underlying the hashing structure, even tiny changes tend to nullify previous indexing efforts, which motivates our proposed framework towards "reconfigurable hashing". The basic idea is to maintain a large pool of over-complete hashing functions embedded in the ambient feature space, which serves as the common infrastructure of high-level diverse semantics. At the runtime, the algorithm dynamically selects relevant hashing bits by maximizing the consistency to specific semantics-induced metric, thereby achieving reusability of the pre-computed hashing bits. Such a reusable scheme especially benefits the indexing and retrieval of large-scale dataset, since it facilitates one-off indexing rather than continuous computation-intensive maintenance towards metric adaptation. We propose a sequential bit-selection algorithm based on local consistency and global regularization. Extensive studies are conducted on large-scale image benchmarks to comparatively investigate the performance of different strategies on reconfigurable hashing. Despite the vast literature on hashing, to our best knowledge rare endeavors have been spent toward the reusability of hashing structures in large-scale datasets.

[1]  Laurent Amsaleg,et al.  Locality sensitive hashing: A comparison of hash function types and querying mechanisms , 2010, Pattern Recognit. Lett..

[2]  Fei Wang,et al.  Feature Extraction by Maximizing the Average Neighborhood Margin , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[3]  Frédéric Jurie,et al.  Randomized Clustering Forests for Image Classification , 2008, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[4]  Trevor Darrell,et al.  The pyramid match kernel: discriminative classification with sets of image features , 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.

[5]  Alexandr Andoni,et al.  Near-Optimal Hashing Algorithms for Approximate Nearest Neighbor in High Dimensions , 2006, 2006 47th Annual IEEE Symposium on Foundations of Computer Science (FOCS'06).

[6]  Loong Fah Cheong,et al.  Randomized Locality Sensitive Vocabularies for Bag-of-Features Model , 2010, ECCV.

[7]  Geoffrey E. Hinton,et al.  Semantic hashing , 2009, Int. J. Approx. Reason..

[8]  Wei Liu,et al.  Scalable similarity search with optimized kernel hashing , 2010, KDD.

[9]  Moses Charikar,et al.  Similarity estimation techniques from rounding algorithms , 2002, STOC '02.

[10]  Nicole Immorlica,et al.  Locality-sensitive hashing scheme based on p-stable distributions , 2004, SCG '04.

[11]  Alan M. Frieze,et al.  Min-Wise Independent Permutations , 2000, J. Comput. Syst. Sci..

[12]  Piotr Indyk,et al.  Approximate nearest neighbors: towards removing the curse of dimensionality , 1998, STOC '98.

[13]  Zhe Wang,et al.  Efficiently matching sets of features with random histograms , 2008, ACM Multimedia.

[14]  Meng Wang,et al.  Concept representation based video indexing , 2009, SIGIR.

[15]  Frédéric Jurie,et al.  Sampling Strategies for Bag-of-Features Image Classification , 2006, ECCV.

[16]  Piotr Indyk,et al.  Approximate Nearest Neighbor: Towards Removing the Curse of Dimensionality , 2012, Theory Comput..

[17]  Antonio Torralba,et al.  Spectral Hashing , 2008, NIPS.

[18]  Shih-Fu Chang,et al.  Sequential Projection Learning for Hashing with Compact Codes , 2010, ICML.

[19]  Kristen Grauman,et al.  Kernelized locality-sensitive hashing for scalable image search , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[20]  Shih-Fu Chang,et al.  Semi-supervised hashing for scalable image retrieval , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[21]  Shuicheng Yan,et al.  Weakly-supervised hashing in kernel space , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[22]  Michael I. Jordan,et al.  Distance Metric Learning with Application to Clustering with Side-Information , 2002, NIPS.

[23]  Anthony Widjaja,et al.  Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond , 2003, IEEE Transactions on Neural Networks.