Cross-Modal Search on Social Networking Systems by Exploring Wikipedia Concepts

The increasing popularity of social networking systems (SNSs) has created large quantities of data in multiple modalities, such as text and images. Retrieval of these data, however, is typically constrained to a single modality. Moreover, text on SNSs is usually short and noisy, and remains active only for a short period. These characteristics conflict with the assumptions of traditional text search techniques and render them ineffective on SNSs. To alleviate these problems and bridge the gap between searches over different modalities, we propose a new algorithm that supports cross-modal search over social documents, i.e., text and images, on SNSs. By exploiting Wikipedia concepts, text and images are transformed into a common set of concepts, over which searches are conducted. A new ranking algorithm is designed to rank social documents by their informativeness and semantic relevance to a query. We evaluate our ranking algorithm on both Twitter and Facebook datasets, and the results confirm the effectiveness of our approach.
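The ranking idea described above can be illustrated with a minimal sketch: documents of any modality are represented as sets of Wikipedia concepts, and a document's score is the summed informativeness (here approximated with an IDF-style weight, purely for illustration) of the concepts it shares with the query. The function names, weighting scheme, and example data are hypothetical and simplified, not the paper's actual algorithm.

```python
# Hypothetical sketch of concept-based cross-modal ranking.
# Each document (tweet, photo, ...) is already mapped to a set of
# Wikipedia concepts; informativeness is approximated by an IDF-like
# weight (rarer concepts score higher). Illustrative only.
import math
from collections import Counter

def concept_idf(docs):
    """Informativeness of each concept across the collection."""
    n = len(docs)
    df = Counter(c for concepts in docs.values() for c in set(concepts))
    return {c: math.log(n / df[c]) for c in df}

def rank(query_concepts, docs):
    """Rank documents by summed informativeness of shared concepts."""
    idf = concept_idf(docs)
    scores = {
        doc_id: sum(idf[c] for c in set(concepts) & set(query_concepts))
        for doc_id, concepts in docs.items()
    }
    return sorted(scores, key=scores.get, reverse=True)

# Toy collection mixing modalities: the ranking is modality-agnostic
# because both text and images live in the same concept space.
docs = {
    "tweet1": ["Barack Obama", "White House"],
    "photo1": ["White House", "Washington, D.C."],
    "tweet2": ["Football", "World Cup"],
}
print(rank(["White House"], docs))  # documents sharing the query concept rank first
```

Because both the text and image documents are reduced to the same concept vocabulary, a single query retrieves and ranks results from either modality with one scoring function.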
