Exploring Auxiliary Context: Discrete Semantic Transfer Hashing for Scalable Image Retrieval

Unsupervised hashing can desirably support scalable content-based image retrieval for its appealing advantages of semantic label independence, memory, and search efficiency. However, the learned hash codes are embedded with limited discriminative semantics due to the intrinsic limitation of image representation. To address the problem, in this paper, we propose a novel hashing approach, dubbed as discrete semantic transfer hashing (DSTH). The key idea is to directly augment the semantics of discrete image hash codes by exploring auxiliary contextual modalities. To this end, a unified hashing framework is formulated to simultaneously preserve visual similarities of images and perform semantic transfer from contextual modalities. Furthermore, to guarantee direct semantic transfer and avoid information loss, we explicitly impose the discrete constraint, bit-uncorrelation constraint, and bit-balance constraint on hash codes. A novel and effective discrete optimization method based on augmented Lagrangian multiplier is developed to iteratively solve the optimization problem. The whole learning process has linear computation complexity and desirable scalability. Experiments on three benchmark data sets demonstrate the superiority of DSTH compared with several state-of-the-art approaches.

[1]  Zi Huang,et al.  Discrete Multimodal Hashing With Canonical Views for Robust Mobile Landmark Search , 2017, IEEE Transactions on Multimedia.

[2]  Svetlana Lazebnik,et al.  Locality-sensitive binary codes from shift-invariant kernels , 2009, NIPS.

[3]  Yi Zhen,et al.  A probabilistic model for multimodal hash function learning , 2012, KDD.

[4]  Jun Wang,et al.  Self-taught hashing for fast similarity search , 2010, SIGIR.

[5]  Michael J. Lyons,et al.  Coding facial expressions with Gabor wavelets , 1998, Proceedings Third IEEE International Conference on Automatic Face and Gesture Recognition.

[6]  Sameer A. Nene,et al.  Columbia Object Image Library (COIL100) , 1996 .

[7]  Lei Zhu,et al.  Unsupervised Topic Hypergraph Hashing for Efficient Mobile Image Retrieval , 2017, IEEE Transactions on Cybernetics.

[8]  Terence Sim,et al.  The CMU Pose, Illumination, and Expression Database , 2003, IEEE Trans. Pattern Anal. Mach. Intell..

[9]  Guosheng Lin,et al.  Supervised Hashing Using Graph Cuts and Boosted Decision Trees , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[10]  Fei Wang,et al.  Composite hashing with multiple information sources , 2011, SIGIR.

[11]  Hai Jin,et al.  Weighting scheme for image retrieval based on bag-of-visual-words , 2014, IET Image Process..

[12]  Yi Yang,et al.  Bi-Level Semantic Representation Analysis for Multimedia Event Detection , 2017, IEEE Transactions on Cybernetics.

[13]  Yi Yang,et al.  Semantic Pooling for Complex Event Analysis in Untrimmed Videos , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[14]  Seungjin Choi,et al.  Sequential Spectral Learning to Hash with Multiple Representations , 2012, ECCV.

[15]  Aleix M. Martínez,et al.  Bayes Optimality in Linear Discriminant Analysis , 2008, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[16]  Zi Huang,et al.  Inter-media hashing for large-scale retrieval from heterogeneous data sources , 2013, SIGMOD '13.

[17]  Geoffrey J. Gordon,et al.  Relational learning via collective matrix factorization , 2008, KDD.

[18]  Yang Yang,et al.  A Fast Optimization Method for General Binary Code Learning , 2016, IEEE Transactions on Image Processing.

[19]  Wu-Jun Li,et al.  Scalable Graph Hashing with Feature Transformation , 2015, IJCAI.

[20]  Kien A. Hua,et al.  Linear Subspace Ranking Hashing for Cross-Modal Retrieval , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[21]  Mark J. Huiskes,et al.  The MIR flickr retrieval evaluation , 2008, MIR '08.

[22]  Lei Zhu,et al.  Online Cross-Modal Hashing for Web Image Retrieval , 2016, AAAI.

[23]  Luis Mateus Rocha,et al.  Singular value decomposition and principal component analysis , 2003 .

[24]  Ling Shao,et al.  Multiview Alignment Hashing for Efficient Image Search , 2015, IEEE Transactions on Image Processing.

[25]  G LoweDavid,et al.  Distinctive Image Features from Scale-Invariant Keypoints , 2004 .

[26]  David J. Kriegman,et al.  From Few to Many: Illumination Cone Models for Face Recognition under Variable Lighting and Pose , 2001, IEEE Trans. Pattern Anal. Mach. Intell..

[27]  Michael I. Jordan,et al.  Latent Dirichlet Allocation , 2001, J. Mach. Learn. Res..

[28]  Zi Huang,et al.  Exploring Consistent Preferences: Discrete Hashing with Pair-Exemplar for Scalable Landmark Search , 2017, ACM Multimedia.

[29]  T. Millar,et al.  The UMIST Database for Astrochemistry 1995 , 1997 .

[30]  Feiping Nie,et al.  Trace Ratio Problem Revisited , 2009, IEEE Transactions on Neural Networks.

[31]  Yue Gao,et al.  Large-Scale Cross-Modality Search via Collective Matrix Factorization Hashing , 2016, IEEE Transactions on Image Processing.

[32]  Jianmin Wang,et al.  Semantics-preserving hashing for cross-view retrieval , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[33]  Lei Zhu,et al.  Unsupervised Visual Hashing with Semantic Assistant for Content-Based Image Retrieval , 2017, IEEE Transactions on Knowledge and Data Engineering.

[34]  Xiaojun Chang,et al.  Semisupervised Feature Analysis by Mining Correlations Among Multiple Tasks , 2014, IEEE Transactions on Neural Networks and Learning Systems.

[35]  Lin Yang,et al.  Kernel-Based Supervised Discrete Hashing for Image Retrieval , 2016, ECCV.

[36]  Jiwen Lu,et al.  Deep hashing for compact binary codes learning , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[37]  Robert P. W. Duin,et al.  Multiclass Linear Dimension Reduction by Weighted Pairwise Fisher Criteria , 2001, IEEE Trans. Pattern Anal. Mach. Intell..

[38]  Wei Liu,et al.  Discrete Graph Hashing , 2014, NIPS.

[39]  Zhi-Hua Zhou,et al.  Column Sampling Based Discrete Supervised Hashing , 2016, AAAI.

[40]  Dit-Yan Yeung,et al.  Worst-Case Linear Discriminant Analysis , 2010, NIPS.

[41]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[42]  Mokhtar S. Bazaraa,et al.  Nonlinear Programming: Theory and Algorithms , 1993 .

[43]  Dongqing Zhang,et al.  Large-Scale Supervised Multimodal Hashing with Semantic Correlation Maximization , 2014, AAAI.

[44]  Andy Harter,et al.  Parameterisation of a stochastic model for human face identification , 1994, Proceedings of 1994 IEEE Workshop on Applications of Computer Vision.

[45]  Wei Liu,et al.  Hashing with Graphs , 2011, ICML.

[46]  Michael L. Overton,et al.  Optimality conditions and duality theory for minimizing sums of the largest eigenvalues of symmetric matrices , 2015, Math. Program..

[47]  Rongrong Ji,et al.  Supervised hashing with kernels , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[48]  Lei Zhu,et al.  Topic Hypergraph Hashing for Mobile Image Retrieval , 2015, ACM Multimedia.

[49]  James Ze Wang,et al.  Image retrieval: Ideas, influences, and trends of the new age , 2008, CSUR.

[50]  Lei Zhu,et al.  Unsupervised multi-graph cross-modal hashing for large-scale multimedia retrieval , 2016, Multimedia Tools and Applications.

[51]  Andrew Zisserman,et al.  Video Google: a text retrieval approach to object matching in videos , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[52]  Wei Liu,et al.  Supervised Discrete Hashing , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[53]  Jiwen Lu,et al.  Learning Compact Binary Descriptors with Unsupervised Deep Neural Networks , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[54]  Lei Zhu,et al.  Cross-Modal Self-Taught Hashing for large-scale image retrieval , 2016, Signal Process..

[55]  G. Sapiro,et al.  A collaborative framework for 3D alignment and classification of heterogeneous subvolumes in cryo-electron tomography. , 2013, Journal of structural biology.

[56]  Tat-Seng Chua,et al.  NUS-WIDE: a real-world web image database from National University of Singapore , 2009, CIVR '09.

[57]  Xuelong Li,et al.  Geometric Mean for Subspace Selection , 2009, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[58]  Wei Liu,et al.  Learning to Hash for Indexing Big Data—A Survey , 2015, Proceedings of the IEEE.

[59]  I. Jolliffe Principal Component Analysis , 2002 .

[60]  Wei Liu,et al.  Coordinate Discrete Optimization for Efficient Cross-View Image Retrieval , 2016, IJCAI.

[61]  Dacheng Tao,et al.  Max-Min Distance Analysis by Using Sequential SDP Relaxation for Dimension Reduction , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[62]  Zi Huang,et al.  Effective Multiple Feature Hashing for Large-Scale Near-Duplicate Video Retrieval , 2013, IEEE Transactions on Multimedia.

[63]  Yi Ma,et al.  The Augmented Lagrange Multiplier Method for Exact Recovery of Corrupted Low-Rank Matrices , 2010, Journal of structural biology.

[64]  Hai Jin,et al.  Content-Based Visual Landmark Search via Multimodal Hypergraph Learning , 2015, IEEE Transactions on Cybernetics.

[65]  Lei Zhu,et al.  Learning Compact Visual Representation with Canonical Views for Robust Mobile Landmark Search , 2016, IJCAI.

[66]  Raghavendra Udupa,et al.  Learning Hash Functions for Cross-View Similarity Search , 2011, IJCAI.

[67]  Roger Levy,et al.  A new approach to cross-modal multimedia retrieval , 2010, ACM Multimedia.

[68]  Zhang Ren,et al.  A Variant of the Trace Quotient Formulation for Dimensionality Reduction , 2009, ACCV.

[69]  Xuelong Li,et al.  Latent Semantic Minimal Hashing for Image Retrieval , 2017, IEEE Transactions on Image Processing.

[70]  Harry Wechsler,et al.  The FERET database and evaluation procedure for face-recognition algorithms , 1998, Image Vis. Comput..

[71]  Antonio Torralba,et al.  Spectral Hashing , 2008, NIPS.

[72]  Zi Huang,et al.  Linear cross-modal hashing for efficient multimedia search , 2013, ACM Multimedia.

[73]  Heng Tao Shen,et al.  Unsupervised Deep Hashing with Similarity-Adaptive and Discrete Optimization , 2018, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[74]  Seungjin Choi,et al.  Multi-view anchor graph hashing , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.

[75]  Svetlana Lazebnik,et al.  Iterative quantization: A procrustean approach to learning binary codes , 2011, CVPR 2011.

[76]  Hai Jin,et al.  Landmark Classification With Hierarchical Multi-Modal Exemplar Feature , 2015, IEEE Transactions on Multimedia.

[77]  Fumin Shen,et al.  Multi-view Latent Hashing for Efficient Multimedia Search , 2015, ACM Multimedia.