Mixed image-keyword query adaptive hashing over multilabel images

This article defines a new hashing task motivated by real-world applications in content-based image retrieval, that is, effective data indexing and retrieval given mixed query (query image together with user-provided keywords). Our work is distinguished from state-of-the-art hashing research by two unique features: (1) Unlike conventional image retrieval systems, the input query is a combination of an exemplar image and several descriptive keywords, and (2) the input image data are often associated with multiple labels. It is an assumption that is more consistent with the realistic scenarios. The mixed image-keyword query significantly extends traditional image-based query and better explicates the user intention. Meanwhile it complicates semantics-based indexing on the multilabel data. Though several existing hashing methods can be adapted to solve the indexing task, unfortunately they all prove to suffer from low effectiveness. To enhance the hashing efficiency, we propose a novel scheme “boosted shared hashing”. Unlike prior works that learn the hashing functions on either all image labels or a single label, we observe that the hashing function can be more effective if it is designed to index over an optimal label subset. In other words, the association between labels and hash bits are moderately sparse. The sparsity of the bit-label association indicates greatly reduced computation and storage complexities for indexing a new sample, since only limited number of hashing functions will become active for the specific sample. We develop a Boosting style algorithm for simultaneously optimizing both the optimal label subsets and hashing functions in a unified formulation, and further propose a query-adaptive retrieval mechanism based on hash bit selection for mixed queries, no matter whether or not the query words exist in the training data. Moreover, we show that the proposed method can be easily extended to the case where the data similarity is gauged by nonlinear kernel functions. Extensive experiments are conducted on standard image benchmarks like CIFAR-10, NUS-WIDE and a-TRECVID. The results validate both the sparsity of the bit-label association and the convergence of the proposed algorithm, and demonstrate that the proposed hashing scheme achieves substantially superior performances over state-of-the-art methods under the same hash bit budget.

[1]  Sanjiv Kumar,et al.  On the Difficulty of Nearest Neighbor Search , 2012, ICML.

[2]  Kristen Grauman,et al.  Kernelized locality-sensitive hashing for scalable image search , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[3]  Wei Liu,et al.  Scalable similarity search with optimized kernel hashing , 2010, KDD.

[4]  Yoav Freund,et al.  A decision-theoretic generalization of on-line learning and an application to boosting , 1995, EuroCOLT.

[5]  Alan M. Frieze,et al.  Min-Wise Independent Permutations , 2000, J. Comput. Syst. Sci..

[6]  Piotr Indyk,et al.  Approximate nearest neighbors: towards removing the curse of dimensionality , 1998, STOC '98.

[7]  Rongrong Ji,et al.  Supervised hashing with kernels , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[8]  Nicole Immorlica,et al.  Locality-sensitive hashing scheme based on p-stable distributions , 2004, SCG '04.

[9]  Wei Liu,et al.  Hashing with Graphs , 2011, ICML.

[10]  Antonio Torralba,et al.  Sharing features: efficient boosting procedures for multiclass object detection , 2004, CVPR 2004.

[11]  Rongrong Ji,et al.  Weak attributes for large-scale image retrieval , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[12]  Andrew Zisserman,et al.  The devil is in the details: an evaluation of recent feature encoding methods , 2011, BMVC.

[13]  Zi Huang,et al.  Multiple feature hashing for real-time large scale near-duplicate video retrieval , 2011, ACM Multimedia.

[14]  Jon Louis Bentley,et al.  Multidimensional binary search trees used for associative searching , 1975, CACM.

[15]  Alan L. Yuille,et al.  The Concave-Convex Procedure (CCCP) , 2001, NIPS.

[16]  Di Liu,et al.  Compact kernel hashing with multiple features , 2012, ACM Multimedia.

[17]  Y. Freund,et al.  Discussion of the Paper \additive Logistic Regression: a Statistical View of Boosting" By , 2000 .

[18]  Newton Lee,et al.  ACM Transactions on Multimedia Computing, Communications and Applications (ACM TOMCCAP) , 2007, CIE.

[19]  Y. C. Pati,et al.  Orthogonal matching pursuit: recursive function approximation with applications to wavelet decomposition , 1993, Proceedings of 27th Asilomar Conference on Signals, Systems and Computers.

[20]  Rajat Raina,et al.  Efficient sparse coding algorithms , 2006, NIPS.

[21]  Rich Caruana,et al.  Multitask Learning , 1997, Machine-mediated learning.

[22]  Rong Yan,et al.  Model-shared subspace boosting for multi-label classification , 2007, KDD '07.

[23]  Alan M. Frieze,et al.  Min-wise independent permutations (extended abstract) , 1998, STOC '98.

[24]  Shuicheng Yan,et al.  Learning reconfigurable hashing for diverse semantics , 2011, ICMR '11.

[25]  Tat-Seng Chua,et al.  NUS-WIDE: a real-world web image database from National University of Singapore , 2009, CIVR '09.

[26]  Antonio Torralba,et al.  Modeling the Shape of the Scene: A Holistic Representation of the Spatial Envelope , 2001, International Journal of Computer Vision.

[27]  Andrea Vedaldi,et al.  Vlfeat: an open and portable library of computer vision algorithms , 2010, ACM Multimedia.

[28]  Antonio Torralba,et al.  Spectral Hashing , 2008, NIPS.

[29]  Shih-Fu Chang,et al.  Sequential Projection Learning for Hashing with Compact Codes , 2010, ICML.

[30]  Cordelia Schmid,et al.  Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[31]  Stephen Lin,et al.  Graph Embedding and Extensions: A General Framework for Dimensionality Reduction , 2007, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[32]  Shih-Fu Chang,et al.  Semi-Supervised Hashing for Large-Scale Search , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[33]  Trevor Darrell,et al.  Learning to Hash with Binary Reconstructive Embeddings , 2009, NIPS.

[34]  Alex Krizhevsky,et al.  Learning Multiple Layers of Features from Tiny Images , 2009 .

[35]  Shih-Fu Chang,et al.  Semi-supervised hashing for scalable image retrieval , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[36]  Piotr Indyk,et al.  Approximate Nearest Neighbor: Towards Removing the Curse of Dimensionality , 2012, Theory Comput..

[37]  Fei Wang,et al.  Composite hashing with multiple information sources , 2011, SIGIR.

[38]  Shuicheng Yan,et al.  Weakly-supervised hashing in kernel space , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[39]  Fei-Fei Li,et al.  ImageNet: A large-scale hierarchical image database , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[40]  Svetlana Lazebnik,et al.  Iterative quantization: A procrustean approach to learning binary codes , 2011, CVPR 2011.

[41]  Piotr Indyk,et al.  Nearest Neighbors in High-Dimensional Spaces , 2004, Handbook of Discrete and Computational Geometry, 2nd Ed..

[42]  Shih-Fu Chang,et al.  Compact hashing for mixed image-keyword query over multi-label images , 2012, ICMR '12.

[43]  Shuicheng Yan,et al.  Multimedia semantics-aware query-adaptive hashing with bits reconfigurability , 2012, International Journal of Multimedia Information Retrieval.

[44]  Meng Wang,et al.  Spectral Hashing With Semantically Consistent Graph for Image Indexing , 2013, IEEE Transactions on Multimedia.

[45]  René Vidal,et al.  Sparse subspace clustering , 2009, CVPR.

[46]  ChangShih-Fu,et al.  Mixed image-keyword query adaptive hashing over multilabel images , 2014 .

[47]  Jieping Ye,et al.  A shared-subspace learning framework for multi-label classification , 2010, TKDD.

[48]  Kilian Q. Weinberger,et al.  Distance Metric Learning for Large Margin Nearest Neighbor Classification , 2005, NIPS.