XL-NBT: A Cross-lingual Neural Belief Tracking Framework

Task-oriented dialog systems are becoming pervasive, and many companies rely heavily on them to complement human agents for customer service in call centers. With globalization, the need for cross-lingual customer support is more urgent than ever. However, cross-lingual support poses a great challenge: it requires a large amount of additional annotated data from native speakers. To bypass expensive human annotation and take a first step towards the ultimate goal of building a universal dialog system, we set out to build a cross-lingual state tracking framework. Specifically, we assume that there exists a source language with dialog belief tracking annotations, while the target languages have no annotated dialog data of any form. We first pre-train a state tracker for the source language as a teacher, which is able to exploit easy-to-access parallel data. We then distill and transfer its knowledge to the student state trackers in the target languages. We discuss two common types of parallel resources, bilingual corpora and bilingual dictionaries, and design different transfer learning strategies accordingly. Experimentally, we successfully use an English state tracker as the teacher to transfer its knowledge to both Italian and German trackers and achieve promising results.
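
The teacher-student transfer described above can be illustrated with a minimal sketch. The abstract does not specify the loss or the model architecture, so everything here is an assumption made for illustration: the teacher's belief over three slot values is a hypothetical fixed distribution, the student is reduced to a bare vector of logits for the parallel target-language utterance, and the objective is the cross-entropy between the teacher's soft labels and the student's prediction.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_probs, student_logits):
    """Cross-entropy between teacher soft labels and the student prediction."""
    student_probs = softmax(student_logits)
    return -sum(t * math.log(s) for t, s in zip(teacher_probs, student_probs))

def distillation_grad(teacher_probs, student_logits):
    """Gradient of the loss w.r.t. the student logits: softmax(z) - teacher."""
    student_probs = softmax(student_logits)
    return [s - t for s, t in zip(student_probs, teacher_probs)]

# Hypothetical teacher belief over three slot values for a source utterance.
teacher_probs = [0.8, 0.15, 0.05]
# Student logits for the parallel target-language utterance (assumed setup).
student_logits = [0.0, 0.0, 0.0]

losses = []
for _ in range(500):
    losses.append(distillation_loss(teacher_probs, student_logits))
    grad = distillation_grad(teacher_probs, student_logits)
    # Plain gradient descent with an illustrative step size of 1.0.
    student_logits = [z - 1.0 * g for z, g in zip(student_logits, grad)]

student_probs = softmax(student_logits)
```

In this toy setting the student needs no target-language labels: it is supervised only by the teacher's output on the parallel source side, which is the property the framework relies on. A real tracker would replace the bare logits with a model over target-language utterance features.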
