Spherical hashing

Many binary code encoding schemes based on hashing have been actively studied recently, since they can provide efficient similarity search, especially nearest neighbor search, and compact data representations suitable for handling large scale image databases in many computer vision problems. Existing hashing techniques encode high-dimensional data points by using hyperplane-based hashing functions. In this paper we propose a novel hypersphere-based hashing function, spherical hashing, to map more spatially coherent data points into a binary code compared to hyperplane-based hashing functions. Furthermore, we propose a new binary code distance function, spherical Hamming distance, that is tailored to our hypersphere-based binary coding scheme, and design an efficient iterative optimization process to achieve balanced partitioning of data points for each hash function and independence between hashing functions. Our extensive experiments show that our spherical hashing technique significantly outperforms six state-of-the-art hashing techniques based on hyperplanes across various image benchmarks of sizes ranging from one to 75 million of GIST descriptors. The performance gains are consistent and large, up to 100% improvements. The excellent results confirm the unique merits of the proposed idea in using hyperspheres to encode proximity regions in high-dimensional spaces. Finally, our method is intuitive and easy to implement.

[1]  Svetlana Lazebnik,et al.  Asymmetric Distances for Binary Embeddings , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[2]  Andrew Zisserman,et al.  Video Google: a text retrieval approach to object matching in videos , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[3]  David Nistér,et al.  Scalable Recognition with a Vocabulary Tree , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[4]  Svetlana Lazebnik,et al.  Iterative quantization: A procrustean approach to learning binary codes , 2011, CVPR 2011.

[5]  Tanaka Yuzuru,et al.  Spherical LSH for Approximate Nearest Neighbor Search on Unit Hypersphere , 2007 .

[6]  Svetlana Lazebnik,et al.  Locality-sensitive binary codes from shift-invariant kernels , 2009, NIPS.

[7]  Olivier Buisson,et al.  Random maximum margin hashing , 2011, CVPR 2011.

[8]  Antonio Torralba,et al.  Small codes and large image databases for recognition , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[9]  Piotr Indyk,et al.  Approximate nearest neighbors: towards removing the curse of dimensionality , 1998, STOC '98.

[10]  Nicole Immorlica,et al.  Locality-sensitive hashing scheme based on p-stable distributions , 2004, SCG '04.

[11]  Yu-Wing Tai,et al.  Probabilistic cost model for nearest neighbor search in image retrieval , 2012, Comput. Vis. Image Underst..

[12]  Shih-Fu Chang,et al.  Semi-supervised hashing for scalable image retrieval , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[13]  Andrew Zisserman,et al.  Near Duplicate Image Detection: min-Hash and tf-idf Weighting , 2008, BMVC.

[14]  Jon Louis Bentley,et al.  An Algorithm for Finding Best Matches in Logarithmic Expected Time , 1977, TOMS.

[15]  Prateek Jain,et al.  Fast image search for learned metrics , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[16]  L. Greengard The Rapid Evaluation of Potential Fields in Particle Systems , 1988 .

[17]  Christos Faloutsos,et al.  Similarity search without tears: the OMNI-family of all-purpose access methods , 2001, Proceedings 17th International Conference on Data Engineering.

[18]  Regunathan Radhakrishnan,et al.  Compact hashing with joint optimization of search accuracy and time , 2011, CVPR 2011.

[19]  Kristen Grauman,et al.  Kernelized locality-sensitive hashing for scalable image search , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[20]  Tamer Kahveci,et al.  Reference-based indexing of sequence databases , 2006, VLDB.

[21]  James Ze Wang,et al.  Image retrieval: Ideas, influences, and trends of the new age , 2008, CSUR.

[22]  Moses Charikar,et al.  Similarity estimation techniques from rounding algorithms , 2002, STOC '02.

[23]  Cordelia Schmid,et al.  Product Quantization for Nearest Neighbor Search , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[24]  Beng Chin Ooi,et al.  iDistance: An adaptive B+-tree based indexing method for nearest neighbor search , 2005, TODS.

[25]  Antonio Torralba,et al.  Spectral Hashing , 2008, NIPS.

[26]  Antonio Torralba,et al.  Modeling the Shape of the Scene: A Holistic Representation of the Spatial Envelope , 2001, International Journal of Computer Vision.