NVIDIA NeMo Offline Speech Translation Systems for IWSLT 2023
[1] Boris Ginsburg,et al. Fast Conformer with Linearly Scalable Attention for Efficient Speech Recognition , 2023, arXiv.org.
[2] Boris Ginsburg,et al. Text-only domain adaptation for end-to-end ASR using integrated text-to-mel-spectrogram generator , 2023, INTERSPEECH 2023.
[3] José A. R. Fonollosa,et al. SHAS: Approaching optimal Segmentation for End-to-End Speech Translation , 2022, INTERSPEECH.
[4] Sandeep Subramanian,et al. NVIDIA NeMo’s Neural Machine Translation Systems for English-German and English-Russian News and Biomedical Tasks at WMT21 , 2021, WMT.
[5] Mattia Antonino Di Gangi,et al. MuST-C: A multilingual corpus for end-to-end speech translation , 2021, Comput. Speech Lang..
[6] Emmanuel Dupoux,et al. VoxPopuli: A Large-Scale Multilingual Speech Corpus for Representation Learning, Semi-Supervised Learning and Interpretation , 2021, ACL.
[7] Abdel-rahman Mohamed,et al. wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations , 2020, NeurIPS.
[8] Adrian Łańcucki. FastPitch: Parallel Text-to-Speech with Pitch Prediction , 2020, ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[9] Yu Zhang,et al. Conformer: Convolution-augmented Transformer for Speech Recognition , 2020, INTERSPEECH.
[10] Quoc V. Le,et al. SpecAugment on Large Scale Datasets , 2019, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[11] A. Sanchís,et al. Europarl-ST: A Multilingual Corpus for Speech Translation of Parliamentary Debates , 2019, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[12] Boris Ginsburg,et al. NeMo: a toolkit for building AI applications using Neural Modules , 2019, ArXiv.
[13] Taku Kudo,et al. SentencePiece: A simple and language independent subword tokenizer and detokenizer for Neural Text Processing , 2018, EMNLP.
[14] Yannick Estève,et al. TED-LIUM 3: twice as much data and corpus repartition for experiments on speaker adaptation , 2018, SPECOM.
[15] Frank Hutter,et al. Decoupled Weight Decay Regularization , 2017, ICLR.
[16] Lukasz Kaiser,et al. Attention is All you Need , 2017, NIPS.
[17] Sanjeev Khudanpur,et al. LibriSpeech: An ASR corpus based on public domain audio books , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[18] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[19] Alex Graves,et al. Sequence Transduction with Recurrent Neural Networks , 2012, ArXiv.
[20] S. Hochreiter,et al. Long Short-Term Memory , 1997, Neural Computation.
[21] Ke M. Tran,et al. Findings of the IWSLT 2023 Evaluation Campaign , 2023, IWSLT.
[22] Barry Haddow,et al. SLTEV: Comprehensive Evaluation of Spoken Language Translation , 2021, EACL.
[23] Mauro Cettolo,et al. The IWSLT 2018 Evaluation Campaign , 2018, IWSLT.