Unsupervised, Efficient and Semantic Expertise Retrieval

We introduce an unsupervised discriminative model for the task of retrieving experts in online document collections. We exclusively employ textual evidence and avoid explicit feature engineering by learning distributed word representations in an unsupervised way. We compare our model to state-of-the-art unsupervised statistical vector space and probabilistic generative approaches. Our proposed log-linear model achieves the retrieval performance levels of state-of-the-art document-centric methods with the low inference cost of so-called profile-centric approaches. In most cases it yields a statistically significant improvement in ranking over vector space and generative models, and it matches the performance of supervised methods on various benchmarks. That is, by using text alone we can do as well as methods that work with external evidence and/or relevance feedback. A contrastive analysis of rankings produced by discriminative and generative approaches shows that they have complementary strengths, owing to the ability of the unsupervised discriminative model to perform semantic matching.
