Relational term-suggestion graphs incorporating multipartite concept and expertise networks

Term suggestion recommends query terms to a user based on the user's initial query. Suggesting adequate terms is challenging. Most existing commercial search engines suggest search terms based on how frequently previously used terms match the leading characters the user types. In this article, we present a novel mechanism for constructing semantic term-relation graphs that suggest relevant search terms at the semantic level. We build the term-relation graphs from the multipartite networks of existing social media, in particular Wikipedia. The multipartite linkage networks of contributor-term, term-category, and term-term are extracted from Wikipedia and eventually form the term-relation graphs. To fuse these multipartite linkage networks, we propose incorporating contributor-category networks to model the expertise of the contributors. In our experiments, this step clearly improved the accuracy of the inferred relatedness in the semantic term-relation graphs. Experiments on keyword-expanded search based on 200 TREC-5 ad hoc topics showed a clear advantage of our algorithms over existing approaches.
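As a rough illustration of the fusion idea described above, the following minimal sketch derives a contributor-category expertise table from toy contributor-term and term-category networks, then uses it to weight term-term relatedness. The function names (`build_term_graph`, `suggest`), the toy data, and the weighting formula are all assumptions made for illustration; they are not the algorithm evaluated in this article, and the term-term link network is omitted for brevity.

```python
from collections import defaultdict
from itertools import combinations


def build_term_graph(contributor_terms, term_categories):
    """Fuse contributor-term and term-category networks into term-term weights.

    Hypothetical weighting: a contributor who edited both terms votes for the
    pair with the share of their own edits that fall in the categories the
    two terms have in common.
    """
    # Contributor-category network: how many of a contributor's edited
    # terms belong to each category (a crude proxy for expertise).
    expertise = defaultdict(lambda: defaultdict(int))
    for contributor, terms in contributor_terms.items():
        for term in terms:
            for category in term_categories.get(term, ()):
                expertise[contributor][category] += 1

    graph = defaultdict(float)
    for contributor, terms in contributor_terms.items():
        total = sum(expertise[contributor].values())
        if total == 0:
            continue  # no category information for this contributor
        for a, b in combinations(sorted(terms), 2):
            shared = term_categories.get(a, set()) & term_categories.get(b, set())
            weight = sum(expertise[contributor][c] for c in shared) / total
            if weight > 0:
                graph[(a, b)] += weight
    return dict(graph)


def suggest(graph, term, k=3):
    """Return up to k (neighbor, weight) pairs for a term, strongest first."""
    neighbors = []
    for (a, b), weight in graph.items():
        if a == term:
            neighbors.append((b, weight))
        elif b == term:
            neighbors.append((a, weight))
    # Sort by descending weight, then alphabetically for a stable order.
    neighbors.sort(key=lambda pair: (-pair[1], pair[0]))
    return neighbors[:k]


# Toy data (invented for this sketch, not drawn from Wikipedia).
contributor_terms = {
    "alice": {"jaguar", "leopard", "python"},
    "bob": {"python", "java"},
}
term_categories = {
    "jaguar": {"cats"},
    "leopard": {"cats"},
    "python": {"languages"},
    "java": {"languages"},
}

graph = build_term_graph(contributor_terms, term_categories)
```

With this toy data, `suggest(graph, "python")` relates "python" to "java" but not to "jaguar", because the only contributor who edited both "python" and "jaguar" shows no shared-category expertise for that pair.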
