On the collective classification of email "speech acts"

We consider the problem of classifying email messages according to whether they contain certain "email acts", such as a request or a commitment. We show that exploiting the sequential correlation among email messages in the same thread can improve email-act classification. More specifically, we describe a new text-classification algorithm based on a dependency-network collective classification method, in which the local classifiers are maximum entropy models over words and certain relational features. We show that statistically significant improvements over a bag-of-words baseline classifier can be obtained for some, but not all, email-act classes. The performance improvements obtained by collective classification appear to be consistent across many email acts suggested by prior speech-act theory.
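As a rough illustration of the idea, the sketch below labels each message with a local logistic (maximum-entropy-style) model whose features are the message's words plus the acts currently assigned to its parent and children in the thread, and repeats the labeling until it stops changing. Everything specific here is an assumption made for illustration: the two acts, the hand-set `WEIGHTS`, the feature names, and the simple synchronous re-labeling loop, which stands in for whatever inference procedure the paper actually uses.

```python
import math

# Hypothetical hand-set weights, one local model per email act. A real system
# would learn these with a maximum entropy trainer from labeled threads.
WEIGHTS = {
    "request": {"bias": -1.0, "word:please": 2.5, "word:could": 1.5,
                "parent:deliver": -1.0},
    "deliver": {"bias": -1.0, "word:attached": 2.5,
                "parent:request": 1.5, "child:request": -0.5},
}

def local_features(text, idx, parent, labels):
    """Bag-of-words features plus relational features from neighbours' current labels."""
    feats = {"bias": 1.0}
    for word in set(text.lower().split()):
        feats["word:" + word] = 1.0
    if parent[idx] is not None:
        for act in labels[parent[idx]]:
            feats["parent:" + act] = 1.0
    for child, par in enumerate(parent):
        if par == idx:
            for act in labels[child]:
                feats["child:" + act] = 1.0
    return feats

def predict(feats, act):
    """Probability that the message contains `act` under the local model."""
    score = sum(WEIGHTS[act].get(name, 0.0) * value for name, value in feats.items())
    return 1.0 / (1.0 + math.exp(-score))

def collective_classify(texts, parent, max_iters=10):
    """Re-label every message from its neighbours' previous labels until stable."""
    labels = [set() for _ in texts]  # first pass therefore uses words only
    for _ in range(max_iters):
        new_labels = []
        for i, text in enumerate(texts):
            feats = local_features(text, i, parent, labels)
            new_labels.append({act for act in WEIGHTS if predict(feats, act) > 0.5})
        if new_labels == labels:
            break
        labels = new_labels
    return labels

thread = ["could you please send the report", "here it is"]
in_thread = collective_classify(thread, parent=[None, 0])
no_thread = collective_classify(thread, parent=[None, None])
```

In this toy thread the reply "here it is" has no words that indicate a delivery, so with the thread links removed (`no_thread`) it receives no act, whereas with the links present (`in_thread`) the parent's predicted request is enough to label it as a delivery.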
