A language modeling framework for expert finding

Statistical language models have been successfully applied to many information retrieval tasks, including expert finding: the process of identifying experts given a particular topic. In this paper, we introduce and detail language modeling approaches that integrate the representation, association, and search of experts, using various textual data sources, into a generative probabilistic framework. This provides a simple, intuitive, and extensible theoretical framework to underpin research into expertise search. To demonstrate the flexibility of the framework, we model two search strategies for finding experts, each incorporating different types of evidence extracted from the data, and then extend them to also incorporate co-occurrence information. The proposed models are evaluated in the context of enterprise search systems within an intranet environment, where it is reasonable to assume that the list of experts is known and that the data to be mined is publicly accessible. Our experiments show that these models achieve excellent performance in such environments, and the theoretical and empirical work presented here paves the way for future principled extensions.
