Learning to Classify Email into “Speech Acts”

It is often useful to classify email according to the intent of the sender (e.g., "propose a meeting", "deliver information"). We present experimental results in learning to classify email in this fashion, where each class corresponds to a verb-noun pair taken from a predefined ontology describing typical “email speech acts”. We demonstrate that, although this categorization problem is quite different from “topical” text classification, certain categories of messages can nonetheless be detected with high precision (above 80%) and reasonable recall (above 50%) using existing text-classification learning methods. This result suggests that useful task-tracking tools could be constructed based on automatic classification into this taxonomy.
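To make the precision and recall figures concrete, the following minimal sketch computes both measures for a single binary speech-act category. The function name `precision_recall`, the "propose meeting" category, and the labels are illustrative assumptions, not data or code from the paper.

```python
def precision_recall(gold, predicted):
    """Precision and recall for one binary category.

    gold, predicted: equal-length sequences of booleans, where True means
    the message is labeled (or predicted) as containing the speech act.
    """
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g and p)        # correctly flagged
    fp = sum(1 for g, p in pairs if not g and p)    # flagged in error
    fn = sum(1 for g, p in pairs if g and not p)    # missed messages
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Made-up labels for a hypothetical "propose meeting" category.
gold      = [True, True, True, True, False, False, False, False, False, False]
predicted = [True, True, False, False, False, False, False, False, False, False]

p, r = precision_recall(gold, predicted)
print(f"precision={p:.2f} recall={r:.2f}")
```

In this toy case the classifier flags only two messages, both correctly, so precision is high while half of the true "propose meeting" messages are missed. This is the kind of trade-off the abstract describes: high precision with more modest recall.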
