ELITR Multilingual Live Subtitling: Demo and Strategy

This paper presents an automatic speech translation system for live subtitling of conference presentations. We describe the overall architecture and the key processing components. More importantly, we explain our strategy for building a complex end-user system from numerous individual components, each of which has been tested only in laboratory conditions. The system is a working prototype that is routinely tested on recognizing English, Czech, and German speech and presenting it simultaneously translated into 42 target languages.
