Modeling multi-step relevance propagation for expert finding

An expert finding system allows a user to type a simple text query and retrieve the names and contact information of individuals who possess the expertise expressed in the query. This paper proposes a novel approach to expert finding in large enterprises or intranets by modeling candidate experts (persons), web documents, and various relations among them with so-called expertise graphs. In contrast to state-of-the-art approaches, which estimate personal expertise through one-step propagation of relevance probability from documents to the related candidates, our methods are based on the principle of multi-step relevance propagation in topic-specific expertise graphs. We model the process of expert finding by probabilistic random walks of three kinds: finite, infinite, and absorbing. Experiments on TREC Enterprise Track data originating from two large organizations show that our methods using multi-step relevance propagation improve over the baseline method based on one-step propagation in almost all cases.
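To illustrate the difference between one-step and multi-step propagation, the following is a minimal sketch of a finite random walk on a small bipartite document-candidate graph. It is not the model evaluated in the paper: the uniform transition probabilities, the toy relevance scores, and all names (`finite_walk_scores`, `doc_to_cands`, and so on) are assumptions made for illustration only.

```python
from collections import defaultdict


def normalize(weights):
    """Scale non-negative weights so that they sum to one."""
    total = sum(weights.values())
    return {key: value / total for key, value in weights.items()}


def step(dist, edges):
    """Move probability mass one step along the edges.

    Assumption: each node spreads its mass uniformly over its neighbours.
    A node without neighbours would simply lose its mass.
    """
    nxt = defaultdict(float)
    for node, mass in dist.items():
        neighbours = edges[node]
        for neighbour in neighbours:
            nxt[neighbour] += mass / len(neighbours)
    return dict(nxt)


def finite_walk_scores(doc_relevance, doc_to_cands, steps):
    """Candidate scores after an odd number of walk steps.

    The walk starts in documents with probability proportional to their
    relevance to the query, then alternates document -> candidate and
    candidate -> document moves. steps=1 is one-step propagation.
    """
    if steps < 1 or steps % 2 == 0:
        raise ValueError("steps must be odd so the walk ends at candidates")
    # Build the reverse (candidate -> documents) adjacency lists.
    cand_to_docs = defaultdict(list)
    for doc, cands in doc_to_cands.items():
        for cand in cands:
            cand_to_docs[cand].append(doc)
    dist = normalize(doc_relevance)
    for i in range(steps):
        edges = doc_to_cands if i % 2 == 0 else cand_to_docs
        dist = step(dist, edges)
    return dist


# Hypothetical toy data: relevance of three documents to some query,
# and the candidates mentioned in each document.
doc_relevance = {"d1": 0.6, "d2": 0.3, "d3": 0.1}
doc_to_cands = {"d1": ["alice"], "d2": ["alice", "bob"], "d3": ["bob"]}

one_step = finite_walk_scores(doc_relevance, doc_to_cands, steps=1)
three_step = finite_walk_scores(doc_relevance, doc_to_cands, steps=3)
print(one_step)
print(three_step)
```

In this toy graph the additional steps let relevance flow back from candidates to their documents and on to co-occurring candidates, which shifts some score from the first candidate to the second. A purely finite walk of this kind gradually forgets the initial relevance distribution as the number of steps grows, so the number of steps matters.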
