Optimized Distances for Binary Code Ranking

Binary encoding on high-dimensional data points has attracted much attention due to its computational and storage efficiency. While numerous efforts have been made to encode data points into binary codes, how to calculate the effective distance on binary codes to approximate the original distance is rarely addressed. In this paper, we propose an effective distance measurement for binary code ranking. In our approach, the binary code is firstly decomposed into multiple sub codes, each of which generates a query-dependent distance lookup table. Then the distance between the query and the binary code is constructed as the aggregation of the distances from all sub codes by looking up their respective tables. The entries of the lookup tables are optimized by minimizing the misalignment between the approximate distance and the original distance. Such a scheme is applied to both the symmetric distance and the asymmetric distance. Extensive experimental results show superior performance of the proposed approach over state-of-the-art methods on three real-world high-dimensional datasets for binary code ranking.

[1]  Cordelia Schmid,et al.  Product Quantization for Nearest Neighbor Search , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[2]  Narendra Ahuja,et al.  Robust visual tracking via multi-task sparse learning , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[3]  Pascal Fua,et al.  LDAHash: Improved Matching with Smaller Descriptors , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[4]  Svetlana Lazebnik,et al.  Iterative quantization: A procrustean approach to learning binary codes , 2011, CVPR 2011.

[5]  Heng Tao Shen,et al.  Optimized Cartesian K-Means , 2014, IEEE Transactions on Knowledge and Data Engineering.

[6]  Jingdong Wang,et al.  Inner Product Similarity Search using Compositional Codes , 2014, ArXiv.

[7]  Trevor Darrell,et al.  Nearest-Neighbor Methods in Learning and Vision: Theory and Practice (Neural Information Processing) , 2006 .

[8]  Wei Liu,et al.  Hashing with Graphs , 2011, ICML.

[9]  Wu-Jun Li,et al.  Isotropic Hashing , 2012, NIPS.

[10]  Antonio Torralba,et al.  SIFT Flow: Dense Correspondence across Scenes and Its Applications , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[11]  Antonio Torralba,et al.  Multidimensional Spectral Hashing , 2012, ECCV.

[12]  Huizhong Chen,et al.  Describing Clothing by Semantic Attributes , 2012, ECCV.

[13]  David A. McAllester,et al.  Object Detection with Grammar Models , 2011, NIPS.

[14]  Luis E. Ortiz,et al.  Parsing clothing in fashion photographs , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[15]  David J. Fleet,et al.  Cartesian K-Means , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[16]  Xiao Zhang,et al.  QsRank: Query-sensitive hash code ranking for efficient ∊-neighbor search , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[17]  Rongrong Ji,et al.  Supervised hashing with kernels , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[18]  Zi Huang,et al.  Multiple feature hashing for real-time large scale near-duplicate video retrieval , 2011, ACM Multimedia.

[19]  Yi Yang,et al.  Articulated pose estimation with flexible mixtures-of-parts , 2011, CVPR 2011.

[20]  Shuicheng Yan,et al.  Fashion Parsing With Weak Color-Category Labels , 2014, IEEE Transactions on Multimedia.

[21]  Roberto Cipolla,et al.  Semantic texton forests for image categorization and segmentation , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[22]  Jingdong Wang,et al.  Composite Quantization for Approximate Nearest Neighbor Search , 2014, ICML.

[23]  Stefano Soatto,et al.  Class segmentation and object localization with superpixel neighborhoods , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[24]  Patrick Gros,et al.  Asymmetric hamming embedding: taking the best of our bits for large scale image search , 2011, ACM Multimedia.

[25]  Yongdong Zhang,et al.  Binary Code Ranking with Weighted Hamming Distance , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[26]  Trevor Darrell,et al.  Learning to Hash with Binary Reconstructive Embeddings , 2009, NIPS.

[27]  Qi Tian,et al.  Super-Bit Locality-Sensitive Hashing , 2012, NIPS.

[28]  Svetlana Lazebnik,et al.  Locality-sensitive binary codes from shift-invariant kernels , 2009, NIPS.

[29]  Changsheng Xu,et al.  Street-to-shop: Cross-scenario clothing retrieval via parts alignment and auxiliary set , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[30]  Guosheng Lin,et al.  Learning Hash Functions Using Column Generation , 2013, ICML.

[31]  Tamara L. Berg,et al.  Paper Doll Parsing: Retrieving Similar Styles to Parse Clothing Items , 2013, 2013 IEEE International Conference on Computer Vision.

[32]  Vibhav Vineet,et al.  Struck: Structured Output Tracking with Kernels , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[33]  Jian Dong,et al.  A Deformable Mixture Parsing Model with Parselets , 2013, 2013 IEEE International Conference on Computer Vision.

[34]  David J. Fleet,et al.  Fast search in Hamming space with multi-index hashing , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[35]  Nenghai Yu,et al.  Complementary hashing for approximate nearest neighbor search , 2011, 2011 International Conference on Computer Vision.

[36]  Shih-Fu Chang,et al.  Spherical hashing , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[37]  Panos Kalnis,et al.  Efficient and accurate nearest neighbor and closest pair search in high-dimensional space , 2010, TODS.

[38]  Kristen Grauman,et al.  Kernelized Locality-Sensitive Hashing , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[39]  Changsheng Xu,et al.  Hi, magic closet, tell me what to wear! , 2012, ACM Multimedia.

[40]  Meng Wang,et al.  Movie2Comics: Towards a Lively Video Content Presentation , 2012, IEEE Transactions on Multimedia.

[41]  Zi Huang,et al.  Sparse hashing for fast multimedia search , 2013, TOIS.

[42]  智一 吉田,et al.  Efficient Graph-Based Image Segmentationを用いた圃場図自動作成手法の検討 , 2014 .

[43]  Nenghai Yu,et al.  Order preserving hashing for approximate nearest neighbor search , 2013, ACM Multimedia.

[44]  Svetlana Lazebnik,et al.  Asymmetric Distances for Binary Embeddings , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[45]  Piotr Indyk,et al.  Approximate nearest neighbors: towards removing the curse of dimensionality , 1998, STOC '98.

[46]  Minyi Guo,et al.  Manhattan hashing for large-scale image retrieval , 2012, SIGIR '12.

[47]  Yoshua Bengio,et al.  Gradient-based learning applied to document recognition , 1998, Proc. IEEE.

[48]  Trevor Darrell,et al.  Caffe: Convolutional Architecture for Fast Feature Embedding , 2014, ACM Multimedia.

[49]  Fei-Fei Li,et al.  Discriminative Segment Annotation in Weakly Labeled Video , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[50]  Antonio Torralba,et al.  Spectral Hashing , 2008, NIPS.

[51]  Zi Huang,et al.  Linear cross-modal hashing for efficient multimedia search , 2013, ACM Multimedia.

[52]  Shih-Fu Chang,et al.  Sequential Projection Learning for Hashing with Compact Codes , 2010, ICML.

[53]  Victor Lempitsky,et al.  The inverted multi-index , 2012, CVPR.

[54]  Roberto Cipolla,et al.  Label propagation in video sequences , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.