Finding Experts Using Social Network Analysis

Searching an organization's document repositories for experts is a frequently occurring problem in intranet information management. A common method for finding experts in an organization is to use social networks: people are not isolated but connected by various kinds of associations. In organizations, people explicitly send email to one another, so social networks are likely to be reflected in the patterns of communication. Moreover, some web pages also record relationships among people. In our approach, we propose several strategies for discovering the associations among people from emails and web pages. Based on the resulting social networks, we propose an expertise propagation algorithm: from a list of candidates ranked by their probability of being an expert on a given topic, we select a small set of the top candidates as seeds, and then use the social networks among the candidates to discover other potential experts. Experiments on the TREC Enterprise Track show significant performance improvement with the algorithm.
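
The abstract does not give the propagation formula, so the following is only a minimal sketch of the seed-and-propagate idea under stated assumptions, not the authors' method. The function name `propagate_expertise`, the `damping` factor, the number of seeds, and the rule of splitting a seed's score among its neighbors in proportion to edge weight are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def propagate_expertise(scores, edges, num_seeds=2, damping=0.5, rounds=1):
    """Rank candidates after spreading seed scores over a social network.

    scores: dict mapping candidate -> initial expertise score for a topic.
    edges:  dict mapping (a, b) -> association weight (treated as undirected),
            e.g. the number of emails exchanged. This weighting is an assumption.
    """
    # Build an undirected weighted adjacency map.
    neighbors = defaultdict(dict)
    for (a, b), weight in edges.items():
        neighbors[a][b] = weight
        neighbors[b][a] = weight

    # Select the top-ranked candidates as seeds.
    seeds = sorted(scores, key=scores.get, reverse=True)[:num_seeds]

    result = dict(scores)
    frontier = {seed: scores[seed] for seed in seeds}
    for _ in range(rounds):
        gains = defaultdict(float)
        for person, score in frontier.items():
            total = sum(neighbors[person].values())
            if total == 0:
                continue  # isolated candidate, nothing to propagate
            # Hypothetical rule: pass on a damped share of the score,
            # split among neighbors in proportion to edge weight.
            for other, weight in neighbors[person].items():
                gains[other] += damping * score * weight / total
        for person, gain in gains.items():
            result[person] = result.get(person, 0.0) + gain
        frontier = gains  # the next round propagates only the new gains

    return sorted(result.items(), key=lambda item: item[1], reverse=True)
```

With initial scores of 0.9, 0.8, 0.1, and 0.2 for four candidates, a candidate with a low initial score but a strong association with the top seed can overtake one with a slightly higher initial score and a weaker connection, which is the effect the propagation step is meant to capture.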
