Cross-View Retrieval via Probability-Based Semantics-Preserving Hashing

For efficiently retrieving nearest neighbors from large-scale multiview data, recently hashing methods are widely investigated, which can substantially improve query speeds. In this paper, we propose an effective probability-based semantics-preserving hashing (SePH) method to tackle the problem of cross-view retrieval. Considering the semantic consistency between views, SePH generates one unified hash code for all observed views of any instance. For training, SePH first transforms the given semantic affinities of training data into a probability distribution, and aims to approximate it with another one in Hamming space, via minimizing their Kullback–Leibler divergence. Specifically, the latter probability distribution is derived from all pair-wise Hamming distances between to-be-learnt hash codes of the training data. Then with learnt hash codes, any kind of predictive models like linear ridge regression, logistic regression, or kernel logistic regression, can be learnt as hash functions in each view for projecting the corresponding view-specific features into hash codes. As for out-of-sample extension, given any unseen instance, the learnt hash functions in its observed views can predict view-specific hash codes. Then by deriving or estimating the corresponding output probabilities with respect to the predicted view-specific hash codes, a novel probabilistic approach is further proposed to utilize them for determining a unified hash code. To evaluate the proposed SePH, we conduct extensive experiments on diverse benchmark datasets, and the experimental results demonstrate that SePH is reasonable and effective.

[1]  Nikos Paragios,et al.  Data fusion through cross-modality metric learning using similarity-sensitive hashing , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[2]  Ting Liu,et al.  Clustering Billions of Images with Large Scale Nearest Neighbor Search , 2007, 2007 IEEE Workshop on Applications of Computer Vision (WACV '07).

[3]  Piotr Indyk,et al.  Similarity Search in High Dimensions via Hashing , 1999, VLDB.

[4]  Guiguang Ding,et al.  Collective Matrix Factorization Hashing for Multimodal Data , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[5]  Wu-Jun Li,et al.  Deep Cross-Modal Hashing , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[6]  Witold Pedrycz,et al.  Incremental Hashing for Semantic Image Retrieval in Nonstationary Environments , 2017, IEEE Transactions on Cybernetics.

[7]  Jun Wang,et al.  Self-taught hashing for fast similarity search , 2010, SIGIR.

[8]  Yao Hu,et al.  Fast and Accurate Hashing Via Iterative Nearest Neighbors Expansion , 2014, IEEE Transactions on Cybernetics.

[9]  Roger Levy,et al.  On the Role of Correlation and Abstraction in Cross-Modal Multimedia Retrieval , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[10]  Deng Cai,et al.  Density Sensitive Hashing , 2012, IEEE Transactions on Cybernetics.

[11]  Guiguang Ding,et al.  Latent semantic sparse hashing for cross-modal similarity search , 2014, SIGIR.

[12]  Yizhou Wang,et al.  Quantized Correlation Hashing for Fast Cross-Modal Search , 2015, IJCAI.

[13]  Xuelong Li,et al.  Compact Structure Hashing via Sparse and Similarity Preserving Embedding , 2016, IEEE Transactions on Cybernetics.

[14]  Tat-Seng Chua,et al.  NUS-WIDE: a real-world web image database from National University of Singapore , 2009, CIVR '09.

[15]  Jon Louis Bentley,et al.  An Algorithm for Finding Best Matches in Logarithmic Expected Time , 1977, TOMS.

[16]  Shih-Fu Chang,et al.  Locally Linear Hashing for Extracting Non-linear Manifolds , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[17]  Zi Huang,et al.  Robust Hashing With Local Models for Approximate Similarity Search , 2014, IEEE Transactions on Cybernetics.

[18]  Xuelong Li,et al.  Large-Scale Unsupervised Hashing with Shared Structure Learning , 2015, IEEE Transactions on Cybernetics.

[19]  Xuelong Li,et al.  Asymmetric bagging and random subspace for support vector machines-based relevance feedback in image retrieval , 2006, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[20]  Wen Gao,et al.  Parametric Local Multimodal Hashing for Cross-View Similarity Search , 2013, IJCAI.

[21]  Raghavendra Udupa,et al.  Learning Hash Functions for Cross-View Similarity Search , 2011, IJCAI.

[22]  Xuelong Li,et al.  Spectral Multimodal Hashing and Its Application to Multimedia Retrieval , 2016, IEEE Transactions on Cybernetics.

[23]  Nasser M. Nasrabadi,et al.  Pattern Recognition and Machine Learning , 2006, Technometrics.

[24]  Yi Zhen,et al.  A probabilistic model for multimodal hash function learning , 2012, KDD.

[25]  Yi Zhen,et al.  Co-Regularized Hashing for Multimodal Data , 2012, NIPS.

[26]  David Suter,et al.  Fast Supervised Hashing with Decision Trees for High-Dimensional Data , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[27]  John Shawe-Taylor,et al.  Canonical Correlation Analysis: An Overview with Application to Learning Methods , 2004, Neural Computation.

[28]  Lei Zhu,et al.  Online Cross-Modal Hashing for Web Image Retrieval , 2016, AAAI.

[29]  Dong Xu,et al.  Dimensionality-Dependent Generalization Bounds for k-Dimensional Coding Schemes , 2016, Neural Computation.

[30]  Guiguang Ding,et al.  Accelerated Manhattan hashing via bit-remapping with location information , 2015, Multimedia Tools and Applications.

[31]  Yue Gao,et al.  3-D Object Retrieval and Recognition With Hypergraph Analysis , 2012, IEEE Transactions on Image Processing.

[32]  Fumin Shen,et al.  Inductive Hashing on Manifolds , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[33]  Jürgen Schmidhuber,et al.  Multimodal Similarity-Preserving Hashing , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[34]  Dacheng Tao,et al.  Classification with Noisy Labels by Importance Reweighting , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[35]  I. Jolliffe Principal Component Analysis , 2002 .

[36]  Yiqiang Chen,et al.  Building Sparse Multiple-Kernel SVM Classifiers , 2009, IEEE Transactions on Neural Networks.

[37]  Zi Huang,et al.  Inter-media hashing for large-scale retrieval from heterogeneous data sources , 2013, SIGMOD '13.

[38]  Andrew W. Moore,et al.  The Anchors Hierarchy: Using the Triangle Inequality to Survive High Dimensional Data , 2000, UAI.

[39]  Dacheng Tao,et al.  Large-Margin Multi-ViewInformation Bottleneck , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[40]  Minyi Guo,et al.  Supervised hashing with latent factor models , 2014, SIGIR.

[41]  Jianmin Wang,et al.  Semantics-preserving hashing for cross-view retrieval , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[42]  Mark J. Huiskes,et al.  The MIR flickr retrieval evaluation , 2008, MIR '08.

[43]  Dongqing Zhang,et al.  Large-Scale Supervised Multimodal Hashing with Semantic Correlation Maximization , 2014, AAAI.

[44]  Zhou Yu,et al.  Discriminative coupled dictionary hashing for fast cross-media retrieval , 2014, SIGIR.

[45]  Andrew W. Moore,et al.  An Investigation of Practical Approximate Nearest Neighbor Algorithms , 2004, NIPS.

[46]  Antonio Torralba,et al.  Spectral Hashing , 2008, NIPS.

[47]  Zi Huang,et al.  Linear cross-modal hashing for efficient multimedia search , 2013, ACM Multimedia.

[48]  Xuelong Li,et al.  Spectral Embedded Hashing for Scalable Image Retrieval , 2014, IEEE Transactions on Cybernetics.

[49]  Zhou Yu,et al.  Cross-media hashing with kernel regression , 2014, 2014 IEEE International Conference on Multimedia and Expo (ICME).

[50]  Xuelong Li,et al.  Visual-Textual Joint Relevance Learning for Tag-Based Social Image Search , 2013, IEEE Transactions on Image Processing.

[51]  Qiang Liu,et al.  Kernel-based supervised hashing for cross-view similarity search , 2014, 2014 IEEE International Conference on Multimedia and Expo (ICME).

[52]  Geoffrey E. Hinton,et al.  Visualizing Data using t-SNE , 2008 .

[53]  Svetlana Lazebnik,et al.  Iterative quantization: A procrustean approach to learning binary codes , 2011, CVPR 2011.

[54]  Jingdong Wang,et al.  Collaborative Quantization for Cross-Modal Similarity Search , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[55]  Dacheng Tao,et al.  Multi-View Intact Space Learning , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.