Assessing e-mail intent and tasks in e-mail messages

In this paper, we analyze corporate e-mail messages as a medium for conveying work tasks. Research indicates that categorizing e-mail could alleviate the common problem of information overload. Although e-mail clients offer ways to categorize messages, few users invest effort in proper e-mail management. Since e-mail clients are often used for task management, we argue that intent- and task-based categorization may be what current systems lack. We propose a taxonomy of the tasks expressed in e-mail messages. Using this taxonomy, we manually annotated two e-mail datasets (Enron and Avocado) and evaluated the validity of the taxonomy's dimensions. Furthermore, we investigated the potential for automatic e-mail classification in a machine learning experiment. We found that approximately half of the corporate e-mail messages contain at least one task, mostly informational or procedural in nature. We show that the number of tasks in an e-mail message can be detected automatically with 71% accuracy. An important finding is that e-mails from one company can be used to train a classifier for e-mails from another company. Detecting how many tasks a message contains, whether a reply is expected, and how spatially and time-sensitive a task is can help provide a more detailed priority estimate of the message for the recipient. Such priority-based categorization can support knowledge workers in their battle against e-mail overload. © 2016 Elsevier Inc. All rights reserved.
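The cross-company setup mentioned above (train on one company's e-mails, evaluate on another's) can be illustrated with a minimal sketch. The abstract does not state which classifier or features were used, so everything here is an assumption made for illustration: a small bag-of-words multinomial Naive Bayes stands in for the actual model, the labels are reduced to a hypothetical binary "task" / "no_task" split rather than the number of tasks, and the messages are invented toy data, not drawn from Enron or Avocado.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase whitespace tokenization (a simplifying assumption)."""
    return text.lower().split()


def train_nb(messages):
    """Fit a multinomial Naive Bayes model on (text, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in messages:
        label_counts[label] += 1
        for tok in tokenize(text):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return label_counts, word_counts, vocab


def predict(model, text):
    """Return the label with the highest log-posterior for the text."""
    label_counts, word_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        score = math.log(label_counts[label] / total)
        # Add-one (Laplace) smoothing over the training vocabulary.
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokenize(text):
            if tok in vocab:  # words unseen in training are ignored
                score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(model, messages):
    """Fraction of messages whose predicted label matches the gold label."""
    correct = sum(predict(model, text) == label for text, label in messages)
    return correct / len(messages)


# Hypothetical toy data standing in for two different companies.
company_a = [
    ("please send me the report by friday", "task"),
    ("can you review the attached contract", "task"),
    ("please schedule a meeting with the team", "task"),
    ("thanks for the update", "no_task"),
    ("fyi the office is closed on monday", "no_task"),
    ("great work on the presentation", "no_task"),
]
company_b = [
    ("please review the budget and send comments", "task"),
    ("thanks for the great presentation", "no_task"),
]

# Train on company A only, then evaluate on company B only.
model = train_nb(company_a)
cross_company_accuracy = accuracy(model, company_b)
print(f"cross-company accuracy: {cross_company_accuracy:.2f}")
```

The point of the sketch is only the evaluation protocol: no message from the target company is seen during training. The accuracy on such a tiny invented sample says nothing about the 71% figure reported above.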
