Email pragmatics and automatic classification: A study in the organizational context

This paper presents a two-phase research project aiming to improve email triage for public administration managers. The first phase developed a typology of email classification patterns through a qualitative study involving 34 participants. Inspired by pragmatics and speech act theory, this typology, comprising four top-level categories and 13 subcategories, represents the typical email triage behaviors of managers in an organizational context. The second phase was conducted on a corpus of 1,703 messages drawn from the email of two managers. Using the k-NN (k-nearest neighbor) algorithm, the messages were classified automatically according to lexical and nonlexical features representative of managers' triage patterns. Automatic classification based on the lexicon of the messages was found to be substantially more efficient when k = 2 and n = 2,000. For four categories, the average recall rate was 94.32%, the average precision rate was 94.50%, and the accuracy rate was 94.54%. For 13 categories, the average recall rate was 91.09%, the average precision rate was 84.18%, and the accuracy rate was 88.70%. It appears that a message's nonlexical features are also deeply influenced by email pragmatics. Features related to the recipient and the sender were the most relevant for characterizing email. © 2012 Wiley Periodicals, Inc.
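
As an illustration of the kind of lexical k-NN classification and evaluation summarized above, the following is a minimal sketch and not the authors' implementation. It assumes a plain bag-of-words representation, cosine similarity, and majority voting with k = 2; the category labels ("request", "inform"), the toy messages, and all function names are hypothetical. The abstract does not define n, so the sketch does not model it.

```python
from collections import Counter
import math


def bag_of_words(text):
    """Count lowercased whitespace tokens (an assumed, simplistic lexical feature)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def knn_classify(train, text, k=2):
    """Return the majority label among the k training messages most similar to text.

    train is a list of (message, label) pairs. Ties between labels are
    resolved in favour of the label of the closest neighbour.
    """
    query = bag_of_words(text)
    ranked = sorted(train, key=lambda ex: cosine(bag_of_words(ex[0]), query), reverse=True)
    top = [label for _, label in ranked[:k]]
    votes = Counter(top)
    best = max(votes.values())
    for label in top:  # top is ordered from nearest to farthest
        if votes[label] == best:
            return label


def macro_scores(gold, pred):
    """Return (average precision, average recall, accuracy) over all labels."""
    labels = sorted(set(gold) | set(pred))
    precisions, recalls = [], []
    for lab in labels:
        tp = sum(g == lab and p == lab for g, p in zip(gold, pred))
        fp = sum(g != lab and p == lab for g, p in zip(gold, pred))
        fn = sum(g == lab and p != lab for g, p in zip(gold, pred))
        precisions.append(tp / (tp + fp) if tp + fp else 0.0)
        recalls.append(tp / (tp + fn) if tp + fn else 0.0)
    accuracy = sum(g == p for g, p in zip(gold, pred)) / len(gold)
    return sum(precisions) / len(labels), sum(recalls) / len(labels), accuracy


# Hypothetical toy training set; the study's real corpus had 1,703 messages.
TRAIN = [
    ("please approve the budget request by friday", "request"),
    ("could you send me the signed report", "request"),
    ("minutes of the committee meeting attached", "inform"),
    ("newsletter for your information only", "inform"),
]

if __name__ == "__main__":
    print(knn_classify(TRAIN, "please send the budget report", k=2))
    print(macro_scores(["a", "a", "b", "b"], ["a", "b", "b", "b"]))
```

Averaging precision and recall per category, as `macro_scores` does, is one plausible reading of the "average" rates reported in the abstract; the study may have averaged differently.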
