Non-local evidence for expert finding

The task addressed in this paper, finding experts in an enterprise setting, has gained in importance and interest over the past few years. Commonly, this task is approached as an association-finding exercise between people and topics. Existing techniques represent candidate experts using either whole documents or proximity-based methods; the latter have shown clear precision-enhancing benefits. We complement both document-based and proximity-based approaches to expert finding by importing global evidence of expertise, i.e., evidence obtained using information that is not available in the immediate proximity of a candidate expert's name occurrence, or even on the same page on which the name occurs. Examples include candidate priors, query models, and other documents a candidate expert is associated with. Using the CERC data set created for the TREC 2007 Enterprise track, we identify examples of non-local evidence of expertise. We then propose modified expert retrieval models that are capable of incorporating both local (either document-based or snippet-based) evidence and non-local evidence of expertise. Results show that our refined models significantly outperform existing state-of-the-art approaches.
