Instance-Aware Hashing for Multi-Label Image Retrieval

Similarity-preserving hashing is a commonly used method for nearest neighbor search in large-scale image retrieval. For image retrieval, deep-network-based hashing methods are appealing, since they can simultaneously learn effective image representations and compact hash codes. This paper focuses on deep-network-based hashing for multi-label images, each of which may contain objects of multiple categories. In most existing hashing methods, each image is represented by one piece of hash code, which is referred to as semantic hashing. This setting may be suboptimal for multi-label image retrieval. To solve this problem, we propose a deep architecture that learns instance-aware image representations for multi-label image data, which are organized in multiple groups, with each group containing the features for one category. The instance-aware representations not only bring advantages to semantic hashing but also can be used in category-aware hashing, in which an image is represented by multiple pieces of hash codes and each piece of code corresponds to a category. Extensive evaluations conducted on several benchmark data sets demonstrate that for both the semantic hashing and the category-aware hashing, the proposed method shows substantial improvement over the state-of-the-art supervised and unsupervised hashing methods.

[1]  Jen-Hao Hsiao,et al.  Deep learning of binary hash codes for fast image retrieval , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[2]  Antonio Torralba,et al.  Small codes and large image databases for recognition , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[3]  Kristen Grauman,et al.  Kernelized locality-sensitive hashing for scalable image search , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[4]  Jaana Kekäläinen,et al.  Cumulated gain-based evaluation of IR techniques , 2002, TOIS.

[5]  Wenwu Zhu,et al.  Deep Multimodal Hashing with Orthogonal Regularization , 2015, IJCAI.

[6]  Bingbing Ni,et al.  HCP: A Flexible CNN Framework for Multi-Label Image Classification , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[7]  Dacheng Tao,et al.  Classification with Noisy Labels by Importance Reweighting , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[8]  Wei Liu,et al.  Hashing with Graphs , 2011, ICML.

[9]  Dumitru Erhan,et al.  Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[10]  Mark J. Huiskes,et al.  The MIR flickr retrieval evaluation , 2008, MIR '08.

[11]  Guosheng Lin,et al.  Learning Hash Functions Using Column Generation , 2013, ICML.

[12]  Jonathan T. Barron,et al.  Multiscale Combinatorial Grouping , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[13]  Cristian Sminchisescu,et al.  CPMC: Automatic Object Segmentation Using Constrained Parametric Min-Cuts , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[14]  Svetlana Lazebnik,et al.  Iterative quantization: A procrustean approach to learning binary codes , 2011, CVPR 2011.

[15]  Piotr Indyk,et al.  Similarity Search in High Dimensions via Hashing , 1999, VLDB.

[16]  Hanjiang Lai,et al.  Supervised Hashing for Image Retrieval via Image Representation Learning , 2014, AAAI.

[17]  David J. Fleet,et al.  Hamming Distance Metric Learning , 2012, NIPS.

[18]  Tieniu Tan,et al.  Deep semantic ranking based hashing for multi-label image retrieval , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[19]  Jian Sun,et al.  Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[20]  Philip H. S. Torr,et al.  BING: Binarized normed gradients for objectness estimation at 300fps , 2014, Computational Visual Media.

[21]  Geoffrey E. Hinton,et al.  Semantic hashing , 2009, Int. J. Approx. Reason..

[22]  Rongrong Ji,et al.  Supervised hashing with kernels , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[23]  Geoffrey E. Hinton,et al.  Learning a Nonlinear Embedding by Preserving Class Neighbourhood Structure , 2007, AISTATS.

[24]  Jaana Kekäläinen,et al.  IR evaluation methods for retrieving highly relevant documents , 2000, SIGIR '00.

[25]  Xuelong Li,et al.  Asymmetric bagging and random subspace for support vector machines-based relevance feedback in image retrieval , 2006, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[26]  Trevor Darrell,et al.  Learning to Hash with Binary Reconstructive Embeddings , 2009, NIPS.

[27]  David J. Fleet,et al.  Minimal Loss Hashing for Compact Binary Codes , 2011, ICML.

[28]  Vladlen Koltun,et al.  Geodesic Object Proposals , 2014, ECCV.

[29]  Antonio Torralba,et al.  Spectral Hashing , 2008, NIPS.

[30]  Yangqing Jia,et al.  Deep Convolutional Ranking for Multilabel Image Annotation , 2013, ICLR.

[31]  Koen E. A. van de Sande,et al.  Selective Search for Object Recognition , 2013, International Journal of Computer Vision.

[32]  Luc Van Gool,et al.  The Pascal Visual Object Classes (VOC) Challenge , 2010, International Journal of Computer Vision.

[33]  Pascal Vincent,et al.  The Difficulty of Training Deep Architectures and the Effect of Unsupervised Pre-Training , 2009, AISTATS.

[34]  David Suter,et al.  A General Two-Step Approach to Learning-Based Hashing , 2013, 2013 IEEE International Conference on Computer Vision.

[35]  Hanjiang Lai,et al.  Simultaneous feature learning and hash coding with deep neural networks , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[36]  Shih-Fu Chang,et al.  Semi-supervised hashing for scalable image retrieval , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.