Ranking Users for Intelligent Message Addressing

Finding persons who are knowledgeable on a given topic (i.e., Expert Search) has become an active area of recent research [1, 2, 3]. In this paper we investigate the related task of Intelligent Message Addressing, i.e., finding persons who are potential recipients of a message under composition, given its current contents, its previously specified recipients, or the first few letters of the intended recipient's contact (intelligent auto-completion). We begin by providing quantitative evidence, from a very large corpus, of how frequently email users are subject to message addressing problems. We then propose several techniques for this task, including adaptations of well-known formal models of Expert Search. Surprisingly, a simple model based on the K-Nearest-Neighbors algorithm consistently outperformed all other methods. We also investigated combinations of the proposed methods using fusion techniques, which led to significant performance improvements over the baseline models. In auto-completion experiments, the proposed models also outperformed all standard baselines. Overall, the proposed techniques achieved a Mean Reciprocal Rank (MRR) of more than 0.5 over 5202 queries from 36 different email users, suggesting that intelligent message addressing can be a welcome addition to email.
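For readers unfamiliar with the MRR figure quoted above, the following is a minimal sketch of how Mean Reciprocal Rank is commonly computed for a recipient-ranking task. It is not taken from the paper: the function names, the data layout (a ranked list of candidate addresses paired with the set of true recipients), and the example addresses are all assumptions made for illustration.

```python
from typing import Hashable, Iterable, Sequence, Set, Tuple


def reciprocal_rank(ranking: Sequence[Hashable], relevant: Set[Hashable]) -> float:
    """Return 1/position of the first relevant item, or 0.0 if none is ranked."""
    for position, candidate in enumerate(ranking, start=1):
        if candidate in relevant:
            return 1.0 / position
    return 0.0


def mean_reciprocal_rank(
    queries: Iterable[Tuple[Sequence[Hashable], Set[Hashable]]]
) -> float:
    """Average the reciprocal rank over all (ranking, true recipients) pairs."""
    scores = [reciprocal_rank(ranking, relevant) for ranking, relevant in queries]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical example: three messages, each with a ranked list of
# suggested recipients and the set of recipients actually used.
example_queries = [
    (["ann@x.org", "bob@x.org", "cat@x.org"], {"bob@x.org"}),  # first hit at rank 2
    (["dan@x.org", "eve@x.org"], {"dan@x.org"}),               # first hit at rank 1
    (["fay@x.org", "gus@x.org"], {"hal@x.org"}),               # no hit
]
mrr = mean_reciprocal_rank(example_queries)
```

Under this definition, an MRR above 0.5 means that, averaged over queries, the first correct recipient tends to appear at or near the second position of the suggested list.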

[1]  Gerard Salton, et al.  Term-Weighting Approaches in Automatic Text Retrieval, 1988, Inf. Process. Manag.

[2]  William W. Cohen, et al.  Preventing Information Leaks in Email, 2007, SDM.

[3]  Peter Ingwersen, et al.  Developing a Test Collection for the Evaluation of Integrated Search, 2010, ECIR.

[4]  Yi Zhang, et al.  Graph-based ranking algorithms for e-mail expertise analysis, 2003, DMKD '03.

[5]  M. de Rijke, et al.  Formal models for expert finding in enterprise corpora, 2006, SIGIR.

[6]  Paul P. Maglio, et al.  Expertise identification using email communications, 2003, CIKM '03.

[7]  ChengXiang Zhai, et al.  Probabilistic Models for Expert Finding, 2007, ECIR.

[8]  Thorsten Joachims, et al.  A Probabilistic Analysis of the Rocchio Algorithm with TFIDF for Text Categorization, 1997, ICML.

[9]  James P. Callan, et al.  Combining document representations for known-item search, 2003, SIGIR.

[10]  Dino Pedreschi, et al.  Machine Learning: ECML 2004, 2004, Lecture Notes in Computer Science.

[11]  Javed A. Aslam, et al.  Models for metasearch, 2001, SIGIR '01.

[12]  Yiming Yang, et al.  The Enron Corpus: A New Dataset for Email Classification Research, 2004.

[13]  Craig MacDonald, et al.  Voting for candidates: adapting data fusion techniques for an expert search task, 2006, CIKM '06.

[14]  Christopher Joseph Pal.  CC Prediction with Graphical Models, 2006, CEAS.

[15]  Yiming Yang, et al.  A re-examination of text categorization methods, 1999, SIGIR '99.