Gender in Danger? Evaluating Speech Translation Technology on the MuST-SHE Corpus

Translating from languages without productive grammatical gender like English into gender-marked languages is a well-known difficulty for machines. This difficulty is also due to the fact that the training data on which models are built typically reflect the asymmetries of natural languages, gender bias included. Exclusively fed with textual data, machine translation is intrinsically constrained by the fact that the input sentence does not always contain clues about the gender identity of the referred human entities. But what happens with speech translation, where the input is an audio signal? Can audio provide additional information to reduce gender bias? We present the first thorough investigation of gender bias in speech translation, contributing with: i) the release of a benchmark useful for future studies, and ii) the comparison of different technologies (cascade and end-to-end) on two language directions (English-Italian/French).

[1]  Jason Baldridge,et al.  Mind the GAP: A Balanced Corpus of Gendered Ambiguous Pronouns , 2018, TACL.

[2]  A. Waibel,et al.  Toward Robust Neural Machine Translation for Noisy Input Sequences , 2017, IWSLT.

[3]  Philipp Koehn,et al.  Re-evaluating the Role of Bleu in Machine Translation Research , 2006, EACL.

[4]  Ralph Weischedel,et al.  A STUDY OF TRANSLATION ERROR RATE WITH TARGETED HUMAN ANNOTATION , 2005 .

[5]  Yang Trista Cao,et al.  Toward Gender-Inclusive Coreference Resolution , 2019, ACL.

[6]  Olivier Pietquin,et al.  Listen and Translate: A Proof of Concept for End-to-End Speech-to-Text Translation , 2016, NIPS 2016.

[7]  Saif Mohammad,et al.  Examining Gender and Race Bias in Two Hundred Sentiment Analysis Systems , 2018, *SEMEVAL.

[8]  Marta R. Costa-jussà,et al.  Equalizing Gender Bias in Neural Machine Translation with Word Embeddings Techniques , 2019, Proceedings of the First Workshop on Gender Bias in Natural Language Processing.

[9]  C. F. Hockett A Course in Modern Linguistics , 1959 .

[10]  Andy Way,et al.  Getting Gender Right in Neural Machine Translation , 2019, EMNLP.

[11]  M. Picheny,et al.  Comparison of Parametric Representation for Monosyllabic Word Recognition in Continuously Spoken Sentences , 2017 .

[12]  Matt Post,et al.  Improved speech-to-text translation with the Fisher and Callhome Spanish-English speech translation corpus , 2013, IWSLT.

[13]  Sergey Ioffe,et al.  Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift , 2015, ICML.

[14]  Ahmed Y. Tawfik,et al.  Gender aware spoken language translation applied to English-Arabic , 2018, 2018 2nd International Conference on Natural Language and Speech Processing (ICNLSP).

[15]  Salim Roukos,et al.  Bleu: a Method for Automatic Evaluation of Machine Translation , 2002, ACL.

[16]  Rachel Rudinger,et al.  Gender Bias in Coreference Resolution , 2018, NAACL.

[17]  Matthew G. Snover,et al.  A Study of Translation Edit Rate with Targeted Human Annotation , 2006, AMTA.

[18]  Anupam Datta,et al.  Gender Bias in Neural Natural Language Processing , 2018, Logic, Language, and Security.

[19]  Mai ElSherief,et al.  Mitigating Gender Bias in Natural Language Processing: Literature Review , 2019, ACL.

[20]  Mikko Kurimo,et al.  Efficient estimation of maximum entropy language models with n-gram features: an SRILM extension , 2010, INTERSPEECH.

[21]  Nam Soo Kim,et al.  On Measuring Gender Bias in Translation of Gender-neutral Pronouns , 2019, Proceedings of the First Workshop on Gender Bias in Natural Language Processing.

[22]  Noah A. Smith,et al.  Evaluating Gender Bias in Machine Translation , 2019, ACL.

[23]  Chanwoo Kim,et al.  Data Efficient Direct Speech-to-Text Translation with Modality Agnostic Meta-Learning , 2019, ArXiv.

[24]  Yiming Wang,et al.  Purely Sequence-Trained Neural Networks for ASR Based on Lattice-Free MMI , 2016, INTERSPEECH.

[25]  Mathew Magimai-Doss,et al.  On Learning to Identify Genders from Raw Speech Signal Using CNNs , 2018, INTERSPEECH.

[26]  A. Hood,et al.  Gender , 2019, Textile History.

[27]  Marcello Federico,et al.  Robust Neural Machine Translation for Clean and Noisy Speech Transcripts , 2019, IWSLT.

[28]  Jieyu Zhao,et al.  Gender Bias in Coreference Resolution: Evaluation and Debiasing Methods , 2018, NAACL.

[29]  Jieyu Zhao,et al.  Men Also Like Shopping: Reducing Gender Bias Amplification using Corpus-level Constraints , 2017, EMNLP.

[30]  Matteo Negri,et al.  Adapting Transformer to End-to-End Spoken Language Translation , 2019, INTERSPEECH.

[31]  Quoc V. Le,et al.  SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition , 2019, INTERSPEECH.

[32]  Rico Sennrich,et al.  Neural Machine Translation of Rare Words with Subword Units , 2015, ACL.

[33]  Sanjeev Khudanpur,et al.  Librispeech: An ASR corpus based on public domain audio books , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[34]  Ali Can Kocabiyikoglu,et al.  Augmenting Librispeech with French Translations: A Multimodal Corpus for Direct Speech Translation Evaluation , 2018, LREC.

[35]  Lukasz Kaiser,et al.  Attention is All you Need , 2017, NIPS.

[36]  Daniel Povey,et al.  The Kaldi Speech Recognition Toolkit , 2011 .

[37]  Alex Waibel,et al.  Self-Attentional Models for Lattice Inputs , 2019, ACL.

[38]  Lucia Specia,et al.  The IWSLT 2019 Evaluation Campaign , 2019, IWSLT.

[39]  Yoav Goldberg,et al.  Filling Gender & Number Gaps in Neural Machine Translation with Black-box Context Injection , 2019, Proceedings of the First Workshop on Gender Bias in Natural Language Processing.

[40]  Mauro Cettolo,et al.  The IWSLT 2018 Evaluation Campaign , 2018, IWSLT.

[41]  Florian Metze,et al.  How2: A Large-scale Dataset for Multimodal Language Understanding , 2018, NIPS 2018.

[42]  Luís C. Lamb,et al.  Assessing gender bias in machine translation: a case study with Google Translate , 2018, Neural Computing and Applications.

[43]  Chiori Hori,et al.  Overview of the IWSLT 2005 Evaluation Campaign , 2005, IWSLT.

[44]  Mattia Antonino Di Gangi,et al.  MuST-C: a Multilingual Speech Translation Corpus , 2019, NAACL.

[45]  Yoshua Bengio,et al.  Gradient-based learning applied to document recognition , 1998, Proc. IEEE.