Temporal expert finding through generalized time topic modeling

This paper addresses the problem of semantics-based temporal expert finding: identifying people with a given expertise in different time periods. Many real-world applications, such as matching reviewers to papers and finding hot topics in newswire articles, need to account for time dynamics; intuitively, different reviewers and reporters suit different topics in different time periods. Traditional approaches exploit graph-based link structure with keyword-based matching and ignore semantic information, while topic modeling approaches capture semantics but do not simultaneously model conference influence (richer text semantics and relationships between authors) and time information. As a result, they fail to find appropriate experts for different time periods. We propose a novel Temporal-Expert-Topic (TET) approach, based on Semantics and Temporal Information based Expert Search (STMS), for temporal expert finding that simultaneously models conference influence and time information. In this approach, the occurrence and correlations of topics (semantically related probabilistic clusters of words) change over time, while the meaning of a particular topic remains almost unchanged. Using Bayes' theorem, we obtain topically related experts for different time periods and show how experts' interests and relationships change over time. Experimental results on a scientific literature dataset show that the proposed generalized time topic modeling approach significantly outperforms non-generalized time topic modeling approaches, because it simultaneously captures conference influence and time information.
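The Bayes' theorem step can be illustrated with a minimal sketch. The abstract does not give the exact factorization used by TET, so the form below, P(author | topic, year) proportional to P(topic | author, year) × P(author | year), is an assumption, as are the function name `rank_experts`, the dictionary layout, and the toy probabilities.

```python
def rank_experts(p_topic_given_author, p_author, topic, year):
    """Rank authors for `topic` in `year` with Bayes' theorem.

    Hypothetical inputs (not taken from the paper):
      p_topic_given_author: {(author, year): {topic: P(topic | author, year)}}
      p_author:             {(author, year): P(author | year)}
    """
    scores = {}
    for (author, y), topic_dist in p_topic_given_author.items():
        if y != year:
            continue
        # Unnormalized posterior: likelihood times prior.
        likelihood = topic_dist.get(topic, 0.0)
        prior = p_author.get((author, y), 0.0)
        scores[author] = likelihood * prior

    total = sum(scores.values())
    if total == 0:
        return []  # no author has any mass on this topic in this year

    # Normalize so the scores form P(author | topic, year).
    posterior = {a: s / total for a, s in scores.items()}
    return sorted(posterior.items(), key=lambda kv: kv[1], reverse=True)


# Toy, made-up distributions for two authors in two years.
p_topic_given_author = {
    ("alice", 2005): {"topic_models": 0.8, "ir": 0.2},
    ("bob", 2005): {"topic_models": 0.2, "ir": 0.8},
    ("alice", 2008): {"topic_models": 0.3, "ir": 0.7},
    ("bob", 2008): {"topic_models": 0.6, "ir": 0.4},
}
p_author = {key: 0.5 for key in p_topic_given_author}

ranking_2005 = rank_experts(p_topic_given_author, p_author, "topic_models", 2005)
ranking_2008 = rank_experts(p_topic_given_author, p_author, "topic_models", 2008)
```

With these toy numbers the top-ranked author for the same topic differs between the two years, which is the kind of shift in expertise over time that the abstract describes.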
