Learning to Diversify Expert Finding with Subtopics

Expert finding is concerned with finding people who are knowledgeable about a given topic. It has many applications in enterprise search, social networks, and collaborative management. In this paper, we study the problem of diversification for expert finding. Specifically, using an academic social network as the basis for our experiments, we aim to answer the following question: given a query and an academic social network, how can we diversify the ranking list so that it captures the whole spectrum of relevant authors' expertise? We precisely define the problem and propose a new objective function that incorporates topic-based diversity into the relevance ranking measurement. A learning-based model is presented to optimize the objective function. Our empirical study in a real system validates the effectiveness of the proposed method, which achieves significant improvements (+15.3% to +94.6% in terms of MAP) over alternative methods.
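To make the idea of folding topic-based diversity into a relevance measure concrete, the following is a minimal Python sketch. It is an illustration under stated assumptions, not the objective function defined in the paper: the helper names (`average_precision`, `topic_weighted_ap`), the per-subtopic relevance sets, and the subtopic weights are all hypothetical. The sketch computes standard average precision (the per-query quantity behind MAP) and then a weighted sum of per-subtopic average precision, so that a ranking covering several subtopics early can score higher than one that exhausts a single subtopic first.

```python
from typing import Dict, List, Set


def average_precision(ranking: List[str], relevant: Set[str]) -> float:
    """Standard average precision of one ranked list against a relevant set."""
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, person in enumerate(ranking, start=1):
        if person in relevant:
            hits += 1
            total += hits / rank  # precision at this relevant position
    return total / len(relevant)


def topic_weighted_ap(
    ranking: List[str],
    relevant_by_topic: Dict[str, Set[str]],
    topic_weight: Dict[str, float],
) -> float:
    """Hypothetical diversity-aware score: weighted sum of per-subtopic AP.

    topic_weight is assumed to hold the importance of each subtopic for the
    query (for example, values that sum to 1); this weighting scheme is an
    assumption made for illustration only.
    """
    return sum(
        weight * average_precision(ranking, relevant_by_topic.get(topic, set()))
        for topic, weight in topic_weight.items()
    )


# Toy data: experts "a" and "b" cover subtopic t1, expert "c" covers t2.
relevant_by_topic = {"t1": {"a", "b"}, "t2": {"c"}}
topic_weight = {"t1": 0.4, "t2": 0.6}

single_topic_first = ["a", "b", "c"]  # exhausts t1 before reaching t2
diversified = ["a", "c", "b"]         # covers both subtopics early

score_single = topic_weighted_ap(single_topic_first, relevant_by_topic, topic_weight)
score_diverse = topic_weighted_ap(diversified, relevant_by_topic, topic_weight)
```

With these toy weights, the diversified ordering receives the higher score, which is the behavior a diversity-aware objective is meant to reward. The actual objective and learning procedure are those defined in the paper itself.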
