A Large-Scale Corpus of E-mail Conversations with Standard and Two-Level Dialogue Act Annotations

We present a large-scale corpus of e-mail conversations with domain-agnostic, two-level dialogue act (DA) annotations, intended to support a better understanding of asynchronous conversations. We annotate over 6,000 messages and 35,000 sentences from more than 2,000 threads. To obtain domain-independent and application-independent DA annotations, we adopt ISO standard 24617-2 as the annotation scheme. To assess the difficulty of DA recognition on our corpus, we evaluate several baseline models, including a pre-trained contextual representation model. The experimental results show that BERT outperforms the other neural network models, including previous state-of-the-art models, but falls short of human performance. We also demonstrate that DA tags at two levels of granularity enable a DA recognition model to learn efficiently through multi-task learning. An evaluation of a model trained on our corpus on other domains of asynchronous conversation indicates that our DA annotations are domain-independent.
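As an illustration of how two-level DA tags can feed multi-task learning, the sketch below combines a coarse-level and a fine-level classification loss for a single sentence. It is a minimal sketch under stated assumptions, not the method used in the paper: the weighted-sum objective, the `coarse_weight` parameter, and the function names are all hypothetical, and the abstract does not specify how the two levels are combined.

```python
import math


def cross_entropy(logits, gold):
    """Softmax cross-entropy for one example (numerically stable log-sum-exp)."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[gold]


def multitask_loss(coarse_logits, fine_logits, coarse_gold, fine_gold,
                   coarse_weight=0.5):
    """Weighted sum of the coarse-level and fine-level DA losses.

    Assumption: both sets of logits come from separate output heads on top
    of a shared sentence encoder; coarse_weight is a hypothetical
    hyperparameter balancing the two tasks.
    """
    coarse = cross_entropy(coarse_logits, coarse_gold)
    fine = cross_entropy(fine_logits, fine_gold)
    return coarse_weight * coarse + (1.0 - coarse_weight) * fine


# Example: 2 coarse tags and 4 fine tags, with uninformative (all-zero) logits.
loss = multitask_loss([0.0, 0.0], [0.0, 0.0, 0.0, 0.0], coarse_gold=0, fine_gold=1)
```

With all-zero logits each head predicts a uniform distribution, so the loss reduces to `0.5 * ln 2 + 0.5 * ln 4`. In a real model the gradient of this combined loss would update the shared encoder from both annotation levels at once.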
