Characterizing and Supporting Question Answering in Human-to-Human Communication

Email continues to be one of the most important means of online communication. People spend a significant amount of time sending, reading, searching, and responding to email in order to manage tasks, exchange information, and so on. In this paper, we focus on information exchange over enterprise email in the form of questions and answers. We study a large-scale, publicly available email dataset to characterize this exchange. We augment our analysis with a survey to gain insight into the types of questions exchanged, when and how people get back to them, and whether this behavior is adequately supported by existing email management and search functionality. We leverage this understanding to define the task of extracting question/answer pairs from threaded email conversations. We propose a neural-network-based approach that matches a question to its answer by considering comparisons at different levels of granularity. We also show that performance can be improved by leveraging external question/answer pair data. We test our approach on manually labeled email data collected through a crowd-sourced annotation study. Our findings have implications for designing email clients and intelligent agents that support question answering and information lookup in email.
