The IWSLT 2019 Evaluation Campaign

The IWSLT 2019 evaluation campaign featured three tasks: speech translation of (i) TED talks and (ii) How2 instructional videos from English into German and Portuguese, and (iii) text translation of TED talks from English into Czech. For the first two tasks we encouraged submissions of end-to-end speech-to-text systems, and for the second task participants could also use the video as additional input. We received submissions from 12 research teams. This overview provides detailed descriptions of the data and evaluation conditions of each task and reports the results of the participating systems.
