Machine Reading Comprehension for Answer Re-Ranking in Customer Support Chatbots

Recent advances in deep neural networks, language modeling and language generation have introduced new ideas to the field of conversational agents. As a result, deep neural models such as sequence-to-sequence, memory networks, and the Transformer have become key ingredients of state-of-the-art dialog systems. While those models are able to generate meaningful responses even in unseen situations, they need a lot of training data to build a reliable model. Thus, most real-world systems have used traditional approaches based on information retrieval (IR) and even hand-crafted rules, due to their robustness and effectiveness, especially for narrow-focused conversations. Here, we present a method that adapts a deep neural architecture from the domain of machine reading comprehension to re-rank the suggested answers from different models using the question as a context. We train our model using negative sampling based on question–answer pairs from the Twitter Customer Support Dataset. The experimental results show that our re-ranking framework can improve the performance in terms of word overlap and semantics both for individual models as well as for model combinations.

[1]  Jason Weston,et al.  Reading Wikipedia to Answer Open-Domain Questions , 2017, ACL.

[2]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[3]  Ali Farhadi,et al.  Bidirectional Attention Flow for Machine Comprehension , 2016, ICLR.

[4]  Nitish Srivastava,et al.  Dropout: a simple way to prevent neural networks from overfitting , 2014, J. Mach. Learn. Res..

[5]  Joelle Pineau,et al.  Building End-To-End Dialogue Systems Using Generative Hierarchical Neural Network Models , 2015, AAAI.

[6]  Wei Chu,et al.  AliMe Chat: A Sequence to Sequence and Rerank based Chatbot Engine , 2017, ACL.

[7]  Christopher D. Manning,et al.  Effective Approaches to Attention-based Neural Machine Translation , 2015, EMNLP.

[8]  Jianfeng Gao,et al.  Recurrent Reinforcement Learning: A Hybrid Approach , 2015, ArXiv.

[9]  Xueqi Cheng,et al.  DeepRank: A New Deep Architecture for Relevance Ranking in Information Retrieval , 2017, CIKM.

[10]  Preslav Nakov,et al.  SemEval-2017 Task 3: Community Question Answering , 2017, *SEMEVAL.

[11]  Preslav Nakov,et al.  Towards Automated Customer Support , 2018, AIMSA.

[12]  Jürgen Schmidhuber,et al.  Highway Networks , 2015, ArXiv.

[13]  Hugo Zaragoza,et al.  The Probabilistic Relevance Framework: BM25 and Beyond , 2009, Found. Trends Inf. Retr..

[14]  Jörg Tiedemann,et al.  OpenSubtitles2016: Extracting Large Parallel Corpora from Movie and TV Subtitles , 2016, LREC.

[15]  Jakob Grue Simonsen,et al.  A Hierarchical Recurrent Encoder-Decoder for Generative Context-Aware Query Suggestion , 2015, CIKM.

[16]  Ming Zhou,et al.  SuperAgent: A Customer Service Chatbot for E-commerce Websites , 2017, ACL.

[17]  Jürgen Schmidhuber,et al.  Long Short-Term Memory , 1997, Neural Computation.

[18]  Yang Li,et al.  Question Answering for Technical Customer Support , 2018, NLPCC.

[19]  Joelle Pineau,et al.  How NOT To Evaluate Your Dialogue System: An Empirical Study of Unsupervised Evaluation Metrics for Dialogue Response Generation , 2016, EMNLP.

[20]  Quoc V. Le,et al.  Sequence to Sequence Learning with Neural Networks , 2014, NIPS.

[21]  Joelle Pineau,et al.  Hierarchical Neural Network Generative Models for Movie Dialogues , 2015, ArXiv.

[22]  Mihai Surdeanu,et al.  The Stanford CoreNLP Natural Language Processing Toolkit , 2014, ACL.

[23]  Joelle Pineau,et al.  Towards an Automatic Turing Test: Learning to Evaluate Dialogue Responses , 2017, ACL.

[24]  Lukasz Kaiser,et al.  Attention is All you Need , 2017, NIPS.

[25]  Chin-Yew Lin,et al.  Automatic Evaluation of Machine Translation Quality Using Longest Common Subsequence and Skip-Bigram Statistics , 2004, ACL.

[26]  Danqi Chen,et al.  CoQA: A Conversational Question Answering Challenge , 2018, TACL.

[27]  Quoc V. Le,et al.  A Neural Conversational Model , 2015, ArXiv.

[28]  Richard S. Sutton,et al.  Introduction to Reinforcement Learning , 1998 .

[29]  Quoc V. Le,et al.  QANet: Combining Local Convolution with Global Self-Attention for Reading Comprehension , 2018, ICLR.

[30]  Lawrence D. Jackel,et al.  Backpropagation Applied to Handwritten Zip Code Recognition , 1989, Neural Computation.

[31]  Yoshua Bengio,et al.  Neural Machine Translation by Jointly Learning to Align and Translate , 2014, ICLR.

[32]  Bowen Zhou,et al.  Applying deep learning to answer selection: A study and an open task , 2015, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).

[33]  Joelle Pineau,et al.  The Ubuntu Dialogue Corpus: A Large Dataset for Research in Unstructured Multi-Turn Dialogue Systems , 2015, SIGDIAL Conference.

[34]  Salim Roukos,et al.  Bleu: a Method for Automatic Evaluation of Machine Translation , 2002, ACL.

[35]  Yuan Yu,et al.  TensorFlow: A system for large-scale machine learning , 2016, OSDI.

[36]  Preslav Nakov,et al.  Building Chatbots from Forum Data: Model Selection Using Question Answering Metrics , 2017, RANLP.

[37]  Joelle Pineau,et al.  A Survey of Available Corpora for Building Data-Driven Dialogue Systems , 2015, Dialogue Discourse.

[38]  Jeffrey Dean,et al.  Distributed Representations of Words and Phrases and their Compositionality , 2013, NIPS.

[39]  Jeffrey Pennington,et al.  GloVe: Global Vectors for Word Representation , 2014, EMNLP.

[40]  Luke S. Zettlemoyer,et al.  Deep Contextualized Word Representations , 2018, NAACL.

[41]  Jianfeng Gao,et al.  A Neural Network Approach to Context-Sensitive Generation of Conversational Responses , 2015, NAACL.

[42]  Geoffrey E. Hinton,et al.  Layer Normalization , 2016, ArXiv.

[43]  Vasile Rus,et al.  A Comparison of Greedy and Optimal Assessment of Natural Language Student Input Using Word-to-Word Similarity Metrics , 2012, BEA@NAACL-HLT.

[44]  Chin-Yew Lin,et al.  ROUGE: A Package for Automatic Evaluation of Summaries , 2004, ACL 2004.

[45]  Joelle Pineau,et al.  Bootstrapping Dialog Systems with Word Embeddings , 2014 .