Paper information - Integrating multiple windows and document features for expert finding

Expert finding is a key task in enterprise search and has recently attracted considerable attention from both the research and industry communities. Given a search topic, a prominent existing approach is to apply an information retrieval (IR) system to retrieve top-ranking documents, which are then used to derive associations between experts and the search topic based on co-occurrences. However, we argue that expert finding is sensitive to three aspects that current expert finding systems insufficiently address: (a) multiple levels of associations between experts and search topics, (b) document internal structure, and (c) document authority. We propose a novel approach that integrates these three aspects, together with a query expansion technique, in a two-stage model for expert finding. A systematic evaluation is conducted on TREC collections to test the performance of our approach as well as the effects of multiple windows, document features, and query expansion. The experimental results show that query expansion can dramatically improve expert finding performance, with statistical significance. For three well-known IR models, with or without query expansion, document internal structures help improve a single window-based approach but without statistical significance, while our novel multiple window-based approach significantly improves on a single window-based approach both with and without document internal structures. © 2009 Wiley Periodicals, Inc.

Dawei Song | Stefan Rüger | Jianhan Zhu
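
The abstract describes deriving expert-topic associations from co-occurrences within text windows of several sizes, over documents returned by an IR system. The sketch below illustrates that general idea only. It is not the authors' model: the weighted sum over window sizes, the function names, the `window_weights` values, and the input layout (retrieval score, tokens, expert mention positions) are all assumptions made for illustration, and document structure, document authority, and query expansion are left out.

```python
from collections import defaultdict


def count_in_window(tokens, position, query_terms, window):
    """Count query-term tokens within `window` tokens on either side of `position`."""
    start = max(0, position - window)
    end = min(len(tokens), position + window + 1)
    return sum(1 for i in range(start, end)
               if i != position and tokens[i] in query_terms)


def multi_window_scores(docs, query_terms, window_weights):
    """Score experts by weighted co-occurrence with query terms (illustrative only).

    docs: list of (retrieval_score, tokens, mentions), where `mentions` maps an
          expert id to the token positions at which that expert is mentioned.
    window_weights: maps a window half-width to a weight (hypothetical values).
    """
    terms = {t.lower() for t in query_terms}
    scores = defaultdict(float)
    for retrieval_score, tokens, mentions in docs:
        lowered = [t.lower() for t in tokens]
        for expert, positions in mentions.items():
            for window, weight in window_weights.items():
                hits = sum(count_in_window(lowered, p, terms, window)
                           for p in positions)
                # Assumed combination: a linear sum over documents and windows.
                scores[expert] += retrieval_score * weight * hits
    return dict(scores)


# Toy example: one retrieved document mentioning two candidate experts.
tokens = ("alice leads semantic search research while "
          "bob maintains the payroll system").split()
docs = [(1.0, tokens, {"alice": [0], "bob": [6]})]
scores = multi_window_scores(docs, ["semantic", "search"], {2: 0.7, 5: 0.3})
ranking = sorted(scores, key=scores.get, reverse=True)
```

In this toy document, `alice` is mentioned closer to the query terms than `bob`, so she picks up evidence from both the narrow and the wide window and ranks first. Giving the narrow window a larger weight is one plausible way to reward close proximity while still crediting looser associations.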
