Non-transitive Hashing with Latent Similarity Components

Approximating the semantic similarity between entities in the learned Hamming space is key to supervised hashing techniques. The semantic similarities between entities are often non-transitive, since entities can share different latent similarity components. For example, in social networks we connect with people for various reasons, such as sharing common interests, working at the same company, or being alumni of the same school. These social connections are non-transitive when people are connected for different reasons: A may know B through a shared interest and B may know C through work, while A and C have nothing in common. However, existing supervised hashing methods treat all pairwise similarity relationships uniformly and project the data into a single Hamming space, neglecting the fact that a single Hamming space cannot adequately capture the non-transitive property. In this paper, we propose a non-transitive hashing method, namely Multi-Component Hashing (MuCH), which identifies the latent similarity components in order to cope with non-transitive similarity relationships. MuCH generates multiple hash tables, each corresponding to one similarity component, and preserves the non-transitive similarities in the respective hash tables. Moreover, we propose a similarity measure, called Multi-Component Similarity, which aggregates the Hamming similarities across the multiple hash tables to capture the non-transitive property of semantic similarity. We conduct extensive experiments on one synthetic dataset and two public real-world datasets (i.e., DBLP and NUS-WIDE). The results clearly demonstrate that the proposed MuCH method significantly outperforms state-of-the-art hashing methods, especially in search efficiency.
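The idea of aggregating Hamming similarities over several hash tables can be illustrated with a minimal sketch. The abstract does not state the aggregation rule, so taking the maximum over tables is an assumption made here for illustration only; the function names, the two component labels, and the hand-written 4-bit codes are likewise hypothetical and are not learned by MuCH.

```python
from typing import Sequence

Code = Sequence[int]  # one binary hash code, e.g. [1, 0, 1, 1]


def hamming_similarity(a: Code, b: Code) -> float:
    """Fraction of bit positions on which two equal-length codes agree."""
    if len(a) != len(b):
        raise ValueError("codes must have the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def multi_component_similarity(
    codes_a: Sequence[Code], codes_b: Sequence[Code]
) -> float:
    """Aggregate per-table Hamming similarities.

    ASSUMPTION: the aggregation is a max over tables, so two entities
    count as similar if they agree in at least one component.
    """
    if len(codes_a) != len(codes_b):
        raise ValueError("entities must have the same number of tables")
    return max(hamming_similarity(a, b) for a, b in zip(codes_a, codes_b))


# Hypothetical codes: table 0 ~ "shared interests", table 1 ~ "same company".
alice = [[1, 0, 1, 1], [0, 0, 1, 0]]
bob = [[1, 0, 1, 1], [1, 1, 0, 1]]    # agrees with alice in table 0
carol = [[0, 1, 0, 0], [1, 1, 0, 1]]  # agrees with bob in table 1

sim_ab = multi_component_similarity(alice, bob)
sim_bc = multi_component_similarity(bob, carol)
sim_ac = multi_component_similarity(alice, carol)

print(sim_ab, sim_bc, sim_ac)
```

In this toy setup alice and bob are similar, bob and carol are similar, yet alice and carol are not, which is the non-transitive pattern described above. If each entity's two codes were concatenated into a single 8-bit code, both similar pairs would agree on only half of the bits, which hints at why one Hamming space struggles to represent such relationships.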
