BiToD: A Bilingual Multi-Domain Dataset For Task-Oriented Dialogue Modeling

Task-oriented dialogue (ToD) benchmarks provide an important avenue to measure progress and develop better conversational agents. However, existing datasets for end-to-end ToD modeling are limited to a single language, hindering the development of robust end-to-end ToD systems for multilingual countries and regions. Here we introduce BiToD, the first bilingual multi-domain dataset for end-to-end task-oriented dialogue modeling. BiToD contains over 7k multi-domain dialogues (144k utterances) with a large and realistic bilingual knowledge base. It serves as an effective benchmark for evaluating bilingual ToD systems and cross-lingual transfer learning approaches. We provide state-of-the-art baselines under three evaluation settings (monolingual, bilingual, and cross-lingual). The analysis of our baselines in different settings highlights 1) the effectiveness of training a bilingual ToD system compared to two independent monolingual ToD systems, and 2) the potential of leveraging a bilingual knowledge base and cross-lingual transfer learning to improve system performance under low-resource conditions.
