Learning Aggregation Functions for Expert Search

Machine learning techniques are increasingly being applied to problems in information retrieval and text mining. In this paper we present an application of evolutionary computation to expert search. In enterprise information systems, expert search is the problem of finding and ranking candidate experts given an information need (query). A difficult part of this problem is finding information relevant to the query and associating that information with a potential expert. We attempt to improve the effectiveness of a benchmark expert search approach by adopting a learning model (genetic programming) that learns how to aggregate the documents and information associated with each expert. In particular, we analyse the aggregation of document information and show that, for optimal performance, different numbers of documents should be aggregated for different queries. We then attempt to learn a function that optimises the effectiveness of an expert search system by aggregating different numbers of documents for different queries. We also present experiments with a second approach that aims to learn the best way to aggregate documents for individual experts. We find that this second approach achieves substantial improvements in performance over standard analytical benchmarks.
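To make the idea of aggregating a variable number of documents concrete, the following is a minimal sketch, not the method of the paper. It assumes a simple sum of retrieval scores over the top-k documents (in the spirit of voting-style aggregation) and uses hypothetical names (`rank_experts`, `doc_scores`, `doc_experts`) and made-up toy scores. It only illustrates why the choice of k can change the expert ranking for a query.

```python
from collections import defaultdict


def rank_experts(doc_scores, doc_experts, k):
    """Rank candidates by summing the scores of the top-k retrieved documents.

    doc_scores:  dict mapping doc id -> retrieval score for the query
    doc_experts: dict mapping doc id -> list of candidate ids linked to the doc
    k:           number of top-ranked documents to aggregate
    """
    # Keep only the k highest-scoring documents for this query.
    top_docs = sorted(doc_scores, key=doc_scores.get, reverse=True)[:k]

    # Each document contributes its score to every associated candidate.
    totals = defaultdict(float)
    for doc in top_docs:
        for expert in doc_experts.get(doc, ()):
            totals[expert] += doc_scores[doc]

    # Highest aggregate score first; ties broken by candidate id.
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))


# Toy data (illustrative only): one strong document for "alice",
# several weaker documents for "bob".
doc_scores = {"d1": 3.0, "d2": 2.0, "d3": 1.5, "d4": 1.0}
doc_experts = {"d1": ["alice"], "d2": ["bob"], "d3": ["bob"], "d4": ["bob"]}

for k in (1, 2, 4):
    print(k, rank_experts(doc_scores, doc_experts, k))
```

With this toy data, a small k favours the candidate with the single best document, while a larger k favours the candidate with many moderately relevant documents. A learned function, as described above, would choose how much evidence to aggregate rather than fixing k by hand.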
