Training Neural Response Selection for Task-Oriented Dialogue Systems

Despite their popularity in the chatbot literature, retrieval-based models have had modest impact on task-oriented dialogue systems, with the main obstacle to their application being the low-data regime of most task-oriented dialogue tasks. Inspired by the recent success of pretraining in language modelling, we propose an effective method for deploying response selection in task-oriented dialogue. To train response selection models for task-oriented dialogue tasks, we propose a novel method which: 1) pretrains the response selection model on large general-domain conversational corpora; and then 2) fine-tunes the pretrained model for the target dialogue domain, relying only on the small in-domain dataset to capture the nuances of the given dialogue domain. Our evaluation on six diverse application domains, ranging from e-commerce to banking, demonstrates the effectiveness of the proposed training method.
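The two-stage recipe described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation. The abstract does not specify the scoring function or the loss, so the dot-product scorer and the in-batch softmax loss below are assumptions. The names `in_batch_softmax_loss`, `select_response`, `pretrain_then_finetune`, and the `update` callback are hypothetical.

```python
import math
from typing import Callable, Iterable, List, Sequence, TypeVar

Vector = Sequence[float]
Params = TypeVar("Params")
Batch = TypeVar("Batch")


def dot(u: Vector, v: Vector) -> float:
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def in_batch_softmax_loss(context_vecs: List[Vector],
                          response_vecs: List[Vector]) -> float:
    """Mean cross-entropy over a batch of (context, response) pairs.

    Assumption: response i is the correct reply to context i, and the
    other responses in the batch serve as negative examples.
    """
    total = 0.0
    for i, context in enumerate(context_vecs):
        scores = [dot(context, response) for response in response_vecs]
        top = max(scores)  # subtract the max for numerical stability
        log_z = top + math.log(sum(math.exp(s - top) for s in scores))
        total += log_z - scores[i]
    return total / len(context_vecs)


def select_response(context_vec: Vector, candidate_vecs: List[Vector]) -> int:
    """Return the index of the highest-scoring candidate response."""
    scores = [dot(context_vec, candidate) for candidate in candidate_vecs]
    return max(range(len(scores)), key=scores.__getitem__)


def pretrain_then_finetune(params: Params,
                           general_batches: Iterable[Batch],
                           in_domain_batches: Iterable[Batch],
                           update: Callable[[Params, Batch], Params]) -> Params:
    """Apply the same update rule in two stages.

    `update` is a hypothetical callback that returns new parameters after
    one training step on a batch (for example, one gradient step on the
    loss above).
    """
    # Stage 1: pretrain on the large general-domain conversational corpus.
    for batch in general_batches:
        params = update(params, batch)
    # Stage 2: fine-tune on the small in-domain dataset.
    for batch in in_domain_batches:
        params = update(params, batch)
    return params
```

At inference time, `select_response` ranks a fixed pool of candidate responses against the encoded context. Both stages share one update rule here only to keep the sketch short; in practice the fine-tuning stage may use different hyperparameters.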

Matthew Henderson | Ivan Vulić | Daniela Gerz | Iñigo Casanueva | Paweł Budzianowski | Sam Coope | Georgios Spithourakis | Tsung-Hsien Wen | Nikola Mrkšić | Pei-hao Su
