Paper information - Crossing the Conversational Chasm: A Primer on Multilingual Task-Oriented Dialogue Systems

Despite the fact that natural language conversations with machines represent one of the central objectives of AI, and despite the massive increase of research and development efforts in conversational AI, task-oriented dialogue (TOD) – i.e., conversations with an artificial agent with the aim of completing a concrete task – is currently limited to a few narrow domains (e.g., food ordering, ticket booking) and a handful of major languages (e.g., English, Chinese). In this work, we provide an extensive overview of existing efforts in multilingual TOD and analyse the factors preventing the development of truly multilingual TOD systems. We identify two main challenges that, combined, hinder faster progress in multilingual TOD: (1) current state-of-the-art TOD models based on large pretrained neural language models are data hungry; at the same time, (2) data acquisition for TOD use cases is expensive and tedious. Most existing approaches to multilingual TOD thus rely on (zero- or few-shot) cross-lingual transfer from resource-rich languages (in TOD, this is basically only English), either by means of (i) machine translation or (ii) multilingual representation spaces. However, such approaches are currently not a viable solution for a large number of low-resource languages with no parallel data and/or limited monolingual corpora. Finally, we discuss critical challenges and potential solutions by drawing parallels between TOD and other cross-lingual and multilingual NLP research.
