Modeling and exploiting heterogeneous bibliographic networks for expertise ranking

Expertise retrieval has recently received increasing interest in both academia and industry. Finding experts with demonstrated expertise for a given query is a nontrivial task, especially in large-scale Web 2.0 systems such as question answering services and bibliographic databases, where users actively publish useful content online, interact with each other, and form social networks in various ways, producing heterogeneous networks in addition to large amounts of textual content. Many approaches have been proposed and shown to be useful for expertise ranking. However, most of these methods consider only the textual documents and ignore heterogeneous network structures, or can integrate only one additional kind of information. None of them fully exploits the characteristics of heterogeneous networks. In this paper, we propose a joint regularization framework that enhances expertise retrieval by modeling heterogeneous networks as regularization constraints on top of a document-centric model. We argue that edges of different types reveal valuable information and should be treated differently. Motivated by this intuition, we formulate three hypotheses that capture the unique characteristics of the different graphs, and we mathematically model those hypotheses jointly with the document and other information. To illustrate our methodology, we apply the framework to expert finding using a bibliography dataset with 1.1 million papers and 0.7 million authors. The experimental results show that our proposed approach achieves significantly better results than the baseline and other enhanced models.
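The abstract does not state the objective function, so the following is only a generic sketch of what graph regularization on top of document-based scores can look like, not the paper's actual formulation. It assumes a quadratic objective that keeps each author's score close to an initial document-centric score while penalizing score differences across edges, with a separate weight per graph type. The function name `regularize_scores`, the parameters `mu` and `graph_weights`, and the toy graphs are all hypothetical.

```python
def regularize_scores(initial, graphs, graph_weights, mu=1.0, iters=200):
    """Smooth initial scores over several undirected graphs (generic sketch).

    Assumed objective (not taken from the paper):
        mu * sum_i (f_i - initial_i)^2
        + sum_g graph_weights[g] * sum_{(i, j, w) in graphs[g]} w * (f_i - f_j)^2

    initial:       list of document-centric scores, one per author
    graphs:        list of edge lists, each edge a tuple (i, j, w)
    graph_weights: one non-negative weight per graph, so that different
                   edge types can be treated differently
    """
    n = len(initial)
    # Merge all graphs into one weighted neighbor map, scaled per graph type.
    neighbors = [dict() for _ in range(n)]
    for edges, lam in zip(graphs, graph_weights):
        for i, j, w in edges:
            neighbors[i][j] = neighbors[i].get(j, 0.0) + lam * w
            neighbors[j][i] = neighbors[j].get(i, 0.0) + lam * w

    scores = list(initial)
    for _ in range(iters):
        # Jacobi update from setting the gradient of the objective to zero.
        scores = [
            (mu * initial[i] + sum(w * scores[j] for j, w in neighbors[i].items()))
            / (mu + sum(neighbors[i].values()))
            for i in range(n)
        ]
    return scores


# Toy example: three authors, only author 0 has textual evidence.
coauthor_edges = [(0, 1, 1.0)]   # hypothetical co-authorship graph
citation_edges = [(1, 2, 1.0)]   # hypothetical second graph type
scores = regularize_scores(
    [1.0, 0.0, 0.0], [coauthor_edges, citation_edges], [1.0, 0.5]
)
```

In this toy setting, authors 1 and 2 receive nonzero scores purely through their links to author 0, and the per-graph weights control how strongly each edge type propagates evidence. With `mu > 0` the update is a Jacobi iteration on a strictly diagonally dominant system, so it converges; a production-scale version would more likely use a sparse linear solver.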
