Paper Information - Bringing Back Structure to Free Text Email Conversations with Recurrent Neural Networks

Email communication is an integral part of everyday life. For business emails in particular, extracting and analysing communication networks can reveal patterns of processes and decision making within a company. Fraud detection is another application area where precise detection of communication networks is essential. In this paper we present an approach based on recurrent neural networks to untangle email threads that result from forwarding and replying. We further classify parts of emails into two or five zones to capture not only header and body information but also greetings and signatures. We show that our deep learning approach outperforms state-of-the-art systems based on traditional machine learning and hand-crafted rules. Besides using the well-known Enron email corpus for our experiments, we additionally created a new annotated email benchmark corpus from Apache mailing lists.
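To illustrate what untangling a thread from zone labels can look like, here is a minimal sketch. It is not the authors' implementation: the function name `split_thread` is hypothetical, and the two-zone labels (`header`, `body`) are written by hand here, standing in for the per-line output of a classifier such as the recurrent network described above. The only rule applied is that a header line following a body line starts a new message.

```python
from typing import Dict, List


def split_thread(lines: List[str], zones: List[str]) -> List[Dict[str, List[str]]]:
    """Group zone-labelled lines of one email text into individual messages.

    Assumption: `zones` holds one label per line, either "header" or "body".
    """
    if len(lines) != len(zones):
        raise ValueError("lines and zones must have the same length")

    messages: List[Dict[str, List[str]]] = []
    current = None
    prev_zone = None
    for line, zone in zip(lines, zones):
        if zone not in ("header", "body"):
            raise ValueError(f"unknown zone label: {zone!r}")
        # Start a new message on the first line, or when a header block
        # begins right after body text (a quoted reply or forward).
        if current is None or (zone == "header" and prev_zone == "body"):
            current = {"header": [], "body": []}
            messages.append(current)
        current[zone].append(line)
        prev_zone = zone
    return messages


# Hand-labelled toy thread: a reply followed by the quoted original message.
lines = [
    "Thanks, looks good.",
    "-----Original Message-----",
    "From: Alice",
    "Sent: Monday",
    "Can you review the draft?",
]
zones = ["body", "header", "header", "header", "body"]

messages = split_thread(lines, zones)
for i, msg in enumerate(messages):
    print(i, msg)
```

With the five-zone scheme, the same grouping would also need a decision on where greeting and signature lines go. That decision is left open here.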

Ralf Krestel | Tim Repke

[1] Susan T. Dumais, et al. Characterizing and Predicting Enterprise Email Reply Behavior, 2017, SIGIR.

[2] Siegfried Handschuh, et al. Classifying Action Items for Semantic Email, 2010, LREC.

[3] Carolyn Penstein Rosé, et al. Recovering Implicit Thread Structure in Newsgroup Style Conversations, 2008, ICWSM.

[4] Alexander M. Rush, et al. Character-Aware Neural Language Models, 2015, AAAI.

[5] Aristides Gionis, et al. Social Network Analysis and Mining for Business Applications, 2011, TIST.

[6] Dominique Estival, et al. Author Profiling for English and Arabic Emails, 2008.

[7] Yoshua Bengio, et al. On the Properties of Neural Machine Translation: Encoder–Decoder Approaches, 2014, SSST@EMNLP.

[8] Iryna Gurevych, et al. Headerless, Quoteless, but not Hopeless? Using Pairwise Email Classification to Disentangle Email Threads, 2013, RANLP.

[9] Wei Xu, et al. Bidirectional LSTM-CRF Models for Sequence Tagging, 2015, ArXiv.

[10] Ben Shneiderman, et al. Beyond Threads: Identifying Discussions in Email Archives, 2005.

[11] Nada Matta, et al. Context Aware Knowledge Zoning: Traceability and Business Emails, 2015, AI4KM@IJCAI.