XPersona: Evaluating Multilingual Personalized Chatbot

Personalized dialogue systems are an essential step toward better human-machine interaction. Existing personalized dialogue agents rely on properly designed conversational datasets, which are mostly monolingual (e.g., English), which greatly limits the usage of conversational agents in other languages. In this paper, we propose a multi-lingual extension of Persona-Chat, namely XPersona. Our dataset includes persona conversations in six different languages other than English for building and evaluating multilingual personalized agents. We experiment with both multilingual and cross-lingual trained baselines, and evaluate them against monolingual and translation-pipeline models using both automatic and human evaluation. Experimental results show that the multilingual trained models outperform the translation-pipeline and that they are on par with the monolingual models, with the advantage of having a single model across multiple languages. On the other hand, the state-of-the-art cross-lingual trained models achieve inferior performance to the other models, showing that cross-lingual conversation modeling is a challenging task. We hope that our dataset and baselines will accelerate research in multilingual dialogue systems.

[1]  Lihong Li,et al.  Neural Approaches to Conversational AI , 2019, Found. Trends Inf. Retr..

[2]  Pascale Fung,et al.  Personalizing Dialogue Agents via Meta-Learning , 2019, ACL.

[3]  Sebastian Schuster,et al.  Cross-lingual Transfer Learning for Multilingual Task Oriented Dialog , 2018, NAACL.

[4]  Salim Roukos,et al.  Bleu: a Method for Automatic Evaluation of Machine Translation , 2002, ACL.

[5]  Jason Weston,et al.  Learning from Dialogue after Deployment: Feed Yourself, Chatbot! , 2019, ACL.

[6]  Jianfeng Gao,et al.  Deep Reinforcement Learning for Dialogue Generation , 2016, EMNLP.

[7]  Quoc V. Le,et al.  A Neural Conversational Model , 2015, ArXiv.

[8]  Boi Faltings,et al.  Personalization in Goal-Oriented Dialog , 2017, ArXiv.

[9]  Regina Barzilay,et al.  Cross-Lingual Alignment of Contextual Word Embeddings, with Applications to Zero-shot Dependency Parsing , 2019, NAACL.

[10]  Ming-Wei Chang,et al.  BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.

[11]  Yunyao Li,et al.  Generating High Quality Proposition Banks for Multilingual Semantic Role Labeling , 2015, ACL.

[12]  Eva Schlinger,et al.  How Multilingual is Multilingual BERT? , 2019, ACL.

[13]  Regina Barzilay,et al.  Ten Pairs to Tag – Multilingual POS Tagging via Coarse Mapping between Embeddings , 2016, NAACL.

[14]  Jason Weston,et al.  What makes a good conversation? How controllable attributes affect human judgments , 2019, NAACL.

[15]  Jianfeng Gao,et al.  A Persona-Based Neural Conversation Model , 2016, ACL.

[16]  Laurent Romary,et al.  CamemBERT: a Tasty French Language Model , 2019, ACL.

[17]  Erik F. Tjong Kim Sang,et al.  Introduction to the CoNLL-2002 Shared Task: Language-Independent Named Entity Recognition , 2002, CoNLL.

[18]  Danqi Chen,et al.  CoQA: A Conversational Question Answering Challenge , 2018, TACL.

[19]  Jason Weston,et al.  Poly-encoders: Transformer Architectures and Pre-training Strategies for Fast and Accurate Multi-sentence Scoring , 2019 .

[20]  Quoc V. Le,et al.  Towards a Human-like Open-Domain Chatbot , 2020, ArXiv.

[21]  Jason Weston,et al.  Wizard of Wikipedia: Knowledge-Powered Conversational agents , 2018, ICLR.

[22]  Dan Roth,et al.  Cross-Lingual Ability of Multilingual BERT: An Empirical Study , 2019, ICLR.

[23]  Lukasz Kaiser,et al.  Attention is All you Need , 2017, NIPS.

[24]  Anna Korhonen,et al.  Semantic Specialization of Distributional Word Vector Spaces using Monolingual and Cross-Lingual Constraints , 2017, TACL.

[25]  Li Dong,et al.  Cross-Lingual Natural Language Generation via Pre-Training , 2020, AAAI.

[26]  Milica Gasic,et al.  POMDP-Based Statistical Spoken Dialog Systems: A Review , 2013, Proceedings of the IEEE.

[27]  Veselin Stoyanov,et al.  Unsupervised Cross-lingual Representation Learning at Scale , 2019, ACL.

[28]  Peng Xu,et al.  Zero-shot Cross-lingual Dialogue Systems with Transferable Latent Variables , 2019, EMNLP.

[29]  Jason Weston,et al.  Personalizing Dialogue Agents: I have a dog, do you have pets too? , 2018, ACL.

[30]  Jason Weston,et al.  Engaging Image Chat: Modeling Personality in Grounded Dialogue , 2018, ArXiv.

[31]  Fei Sha,et al.  Aiming to Know You Better Perhaps Makes Me a More Engaging Dialogue Partner , 2018, CoNLL.

[32]  Heng Ji,et al.  Cross-lingual Name Tagging and Linking for 282 Languages , 2017, ACL.

[33]  Sebastian Riedel,et al.  MLQA: Evaluating Cross-lingual Extractive Question Answering , 2019, ACL.

[34]  Jason Weston,et al.  ACUTE-EVAL: Improved Dialogue Evaluation with Optimized Questions and Multi-turn Comparisons , 2019, ArXiv.

[35]  Wanxiang Che,et al.  Pre-Training with Whole Word Masking for Chinese BERT , 2019, ArXiv.

[36]  Julia Hirschberg,et al.  Named Entity Recognition on Code-Switched Data: Overview of the CALCS 2018 Shared Task , 2018, CodeSwitch@ACL.

[37]  Jaime G. Carbonell,et al.  Neural Cross-Lingual Named Entity Recognition with Minimal Resources , 2018, EMNLP.

[38]  Xin Wang,et al.  XL-NBT: A Cross-lingual Neural Belief Tracking Framework , 2018, EMNLP.

[39]  Joelle Pineau,et al.  How NOT To Evaluate Your Dialogue System: An Empirical Study of Unsupervised Evaluation Metrics for Dialogue Response Generation , 2016, EMNLP.

[40]  Maosong Sun,et al.  XQA: A Cross-lingual Open-domain Question Answering Dataset , 2019, ACL.

[41]  Dilek Z. Hakkani-Tür,et al.  DeepCopy: Grounded Response Generation with Hierarchical Pointer Networks , 2019, SIGdial.

[42]  Pascale Fung,et al.  Code-Switched Language Models Using Neural Based Synthetic Data from Parallel Sentences , 2019, CoNLL.

[43]  Guillaume Lample,et al.  XNLI: Evaluating Cross-lingual Sentence Representations , 2018, EMNLP.

[44]  Peng Xu,et al.  Attention-Informed Mixed-Language Training for Zero-shot Cross-lingual Task-oriented Dialogue Systems , 2019, AAAI.

[45]  Colin Raffel,et al.  mT5: A Massively Multilingual Pre-trained Text-to-Text Transformer , 2021, NAACL.

[46]  Seungwhan Moon,et al.  OpenDialKG: Explainable Conversational Reasoning with Attention-based Walks over Knowledge Graphs , 2019, ACL.

[47]  Fang Deng,et al.  End-to-End Code-Switching ASR for Low-Resourced Language Pairs , 2019, 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).

[48]  Satoshi Nakamura,et al.  Zero-Shot Code-Switching ASR and TTS with Multilingual Machine Speech Chain , 2019, 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).

[49]  Rahul Goel,et al.  On Evaluating and Comparing Open Domain Dialog Systems , 2018 .

[50]  Eunsol Choi,et al.  QuAC: Question Answering in Context , 2018, EMNLP.

[51]  Steve Young,et al.  Semantic Specialization of Distributional Word Vector Spaces using Monolingual and Cross-Lingual Constraints , 2017 .

[52]  Joelle Pineau,et al.  The Second Conversational Intelligence Challenge (ConvAI2) , 2019, The NeurIPS '18 Competition.

[53]  Shashi Narayan,et al.  Leveraging Pre-trained Checkpoints for Sequence Generation Tasks , 2019, Transactions of the Association for Computational Linguistics.

[54]  Rahul Goel,et al.  On Evaluating and Comparing Conversational Agents , 2018, ArXiv.

[55]  Nanyun Peng,et al.  On Difficulties of Cross-Lingual Transfer with Order Differences: A Case Study on Dependency Parsing , 2018, NAACL.

[56]  François Yvon,et al.  Cross-Lingual Part-of-Speech Tagging through Ambiguous Learning , 2014, EMNLP.

[57]  Ilya Sutskever,et al.  Language Models are Unsupervised Multitask Learners , 2019 .

[58]  Jason Weston,et al.  ELI5: Long Form Question Answering , 2019, ACL.

[59]  Natasha Jaques,et al.  Approximating Interactive Human Evaluation with Self-Play for Open-Domain Dialog Systems , 2019, NeurIPS.

[60]  Young-Bum Kim,et al.  Cross-Lingual Transfer Learning for POS Tagging without Cross-Lingual Resources , 2017, EMNLP.

[61]  Dilek Z. Hakkani-Tür,et al.  Topical-Chat: Towards Knowledge-Grounded Open-Domain Conversations , 2019, INTERSPEECH.

[62]  Di He,et al.  Layer-Wise Coordination between Encoder and Decoder for Neural Machine Translation , 2018, NeurIPS.

[63]  Joelle Pineau,et al.  Generative Deep Neural Networks for Dialogue: A Short Review , 2016, ArXiv.

[64]  Martin Wattenberg,et al.  Google’s Multilingual Neural Machine Translation System: Enabling Zero-Shot Translation , 2016, TACL.

[65]  Steve J. Young,et al.  Partially observable Markov decision processes for spoken dialog systems , 2007, Comput. Speech Lang..

[66]  Thomas Wolf,et al.  TransferTransfo: A Transfer Learning Approach for Neural Network Based Conversational Agents , 2019, ArXiv.

[67]  Jason Weston,et al.  Importance of a Search Strategy in Neural Dialogue Modelling , 2018, ArXiv.

[68]  Pascale Fung,et al.  Learning Multilingual Meta-Embeddings for Code-Switching Named Entity Recognition , 2019, RepL4NLP@ACL.

[69]  Marjan Ghazvininejad,et al.  Multilingual Denoising Pre-training for Neural Machine Translation , 2020, Transactions of the Association for Computational Linguistics.

[70]  Tara N. Sainath,et al.  Multilingual Speech Recognition with a Single End-to-End Model , 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[71]  Erik F. Tjong Kim Sang,et al.  Introduction to the CoNLL-2003 Shared Task: Language-Independent Named Entity Recognition , 2003, CoNLL.

[72]  Jianfeng Gao,et al.  DialoGPT: Large-Scale Generative Pre-training for Conversational Response Generation , 2020, ACL.

[73]  Maxine Eskénazi,et al.  Learning Discourse-level Diversity for Neural Dialog Models using Conditional Variational Autoencoders , 2017, ACL.

[74]  Jian Ni,et al.  Weakly Supervised Cross-Lingual Named Entity Recognition via Effective Annotation and Representation Projection , 2017, ACL.

[75]  Guillaume Lample,et al.  Cross-lingual Language Model Pretraining , 2019, NeurIPS.