A comprehensive solution to retrieval-based chatbot construction

In this paper we present the results of our experiments in training and deploying a self-supervised retrieval-based chatbot, trained with contrastive learning, for assisting customer support agents. In contrast to most existing research in this area, which focuses on solving just one component of a deployable chatbot, we present an end-to-end set of solutions that takes the reader from unlabelled chatlogs to a deployed chatbot. This set of solutions includes creating a self-supervised dataset and a weakly labelled dataset from chatlogs, as well as a systematic approach to selecting a fixed list of canned responses. We present a hierarchical RNN architecture for the response selection model, chosen for its ability to cache intermediate utterance embeddings, which helped to meet deployment inference-speed requirements. We compare the performance of this architecture across three learning objectives: self-supervised contrastive learning, binary classification, and multi-class classification. We find that the self-supervised contrastive learning model outperforms the binary and multi-class classification models trained on a weakly labelled dataset. Our results validate that the self-supervised contrastive learning approach can be used effectively for a real-world chatbot scenario.
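To make the self-supervised contrastive objective concrete, the sketch below shows one common formulation for response selection: each context embedding is scored against every response embedding in the batch, and the model is trained so that the true (diagonal) context-response pair wins a softmax over in-batch negatives. This is a minimal NumPy illustration under assumed details (cosine similarity, a `temperature` scaling hyperparameter, in-batch negatives); the paper's exact loss and sampling scheme may differ.

```python
import numpy as np

def in_batch_contrastive_loss(context_emb, response_emb, temperature=0.05):
    """Softmax contrastive loss with in-batch negatives.

    Each context is paired with its own response (the positive);
    every other response in the batch serves as a negative.
    Inputs are (batch, dim) arrays of context/response embeddings.
    """
    # L2-normalise so the dot product becomes cosine similarity
    c = context_emb / np.linalg.norm(context_emb, axis=1, keepdims=True)
    r = response_emb / np.linalg.norm(response_emb, axis=1, keepdims=True)
    logits = (c @ r.T) / temperature  # (batch, batch) similarity matrix

    # Cross-entropy where the correct "class" for row i is column i
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))
```

At inference time this pairing structure is what allows the response side to be precomputed: the canned-response embeddings (and, in a hierarchical encoder, intermediate utterance embeddings) can be cached, leaving only the context encoding and a matrix product per query.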
