Synthetic Data Augmentation for Zero-Shot Cross-Lingual Question Answering

Coupled with the availability of large-scale datasets, deep learning architectures have enabled rapid progress on the Question Answering task. However, most of these datasets are in English, and the performance of state-of-the-art multilingual models is significantly lower when evaluated on non-English data. Due to high data collection costs, it is not realistic to obtain annotated data for every language one wishes to support. We propose a method to improve cross-lingual Question Answering performance without requiring additional annotated data, leveraging Question Generation models to produce synthetic samples in a cross-lingual fashion. We show that the proposed method significantly outperforms baselines trained on English data only, and we report a new state of the art on four multilingual datasets: MLQA, XQuAD, SQuAD-it and PIAF (fr).
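The augmentation idea described above can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: `generate_question`, `pick_answer`, and `build_training_set` are hypothetical names, and the trivial template stands in for a real neural Question Generation model applied to unannotated target-language passages.

```python
def generate_question(context: str, answer: str) -> str:
    # Placeholder for a neural Question Generation model (e.g. a
    # seq2seq model fine-tuned for QG); here, a trivial template.
    return f"What does the passage say about '{answer}'?"

def build_training_set(english_data, target_lang_contexts, pick_answer):
    """Combine annotated English QA pairs with synthetic samples
    generated from unannotated target-language passages."""
    synthetic = []
    for context in target_lang_contexts:
        answer = pick_answer(context)              # answer-span selection heuristic
        question = generate_question(context, answer)
        synthetic.append({"context": context,
                          "question": question,
                          "answers": [answer]})
    # The QA model is then fine-tuned on the union of gold and synthetic data.
    return english_data + synthetic

# Usage with a naive answer selector (first token of the passage):
data = build_training_set(
    english_data=[{"context": "Paris is the capital of France.",
                   "question": "What is the capital of France?",
                   "answers": ["Paris"]}],
    target_lang_contexts=["Torino è una città del Piemonte."],
    pick_answer=lambda c: c.split()[0],
)
```

In the paper's setting the answer selection and question generation are learned components evaluated in a zero-shot cross-lingual regime; this sketch only shows how synthetic target-language samples are merged with English gold data.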
