Conversations Powered by Cross-Lingual Knowledge

Today's open-domain conversational agents increase the informativeness of their responses by leveraging external knowledge. Most existing approaches, however, assume access to massive monolingual knowledge sources; for languages in which such sources are scarce, grounding responses in same-language knowledge is ineffective. To address this problem, we propose the task of cross-lingual knowledge-grounded conversation (CKGC), in which large-scale knowledge sources in another language are leveraged to generate informative responses. This task poses two main challenges: (1) knowledge selection and response generation in a cross-lingual setting; and (2) the lack of a test dataset for evaluation. To tackle the first challenge, we propose a curriculum self-knowledge distillation (CSKD) scheme, which uses a large-scale dialogue corpus in an auxiliary language to improve cross-lingual knowledge selection and knowledge expression in the target language via knowledge distillation. To tackle the second challenge, we collect a cross-lingual knowledge-grounded conversation test dataset to facilitate future research. Extensive experiments on the newly created dataset verify the effectiveness of the proposed curriculum self-knowledge distillation method for cross-lingual knowledge-grounded conversation. In addition, we find that our unsupervised method significantly outperforms state-of-the-art baselines on cross-lingual knowledge selection.
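The abstract leaves the training objective unspecified, but the core of self-knowledge distillation is a divergence term between the model's own predictions on the auxiliary-language side (acting as teacher) and its predictions in the target language (the student), phased in by a curriculum. Below is a minimal PyTorch sketch of such a loss over candidate knowledge sentences; the function name, the linear curriculum schedule, and the temperature rescaling are illustrative assumptions, not the paper's actual formulation.

    import torch
    import torch.nn.functional as F

    def curriculum_self_distillation_loss(
        student_logits: torch.Tensor,  # target-language knowledge-selection scores, shape (batch, n_candidates)
        teacher_logits: torch.Tensor,  # the same model's scores on the auxiliary-language side
        step: int,
        total_steps: int,
        temperature: float = 2.0,
    ) -> torch.Tensor:
        """KL(teacher || student) over candidate knowledge sentences, scaled by a
        curriculum weight that grows linearly with training progress (an assumed
        schedule; the abstract does not specify one)."""
        lam = min(1.0, step / max(1, total_steps))  # curriculum weight in [0, 1]
        # The teacher distribution is the model's own auxiliary-language prediction,
        # detached so that gradients flow only through the student side.
        teacher_probs = F.softmax(teacher_logits.detach() / temperature, dim=-1)
        student_log_probs = F.log_softmax(student_logits / temperature, dim=-1)
        kd = F.kl_div(student_log_probs, teacher_probs, reduction="batchmean")
        return lam * (temperature ** 2) * kd  # T^2 rescaling, as in standard distillation (Hinton et al.)

In practice this term would be added to the usual response-generation loss, with the curriculum weight letting the model first learn from supervised signals before the cross-lingual distillation signal dominates.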
