MAILEX: Email Event and Argument Extraction

In this work, we present MailEx, the first dataset for event extraction from conversational email threads. To this end, we first propose a new taxonomy covering 10 event types and 76 arguments in the email domain. Our final dataset includes ~4K emails annotated with ~9K event instances. To understand the task challenges, we conduct a series of experiments comparing two common lines of approaches to event extraction: sequence labeling and generative end-to-end extraction (including few-shot GPT-3.5). Our results show that email event extraction is far from solved, owing to challenges such as extracting non-continuous, shared trigger spans, extracting non-named-entity arguments, and modeling the email conversational history. Our work thus calls for further investigation of this domain-specific event extraction task. The source code and dataset can be obtained from https://github.com/salokr/Email-Event-Extraction.
