Beyond Voice Activity Detection: Hybrid Audio Segmentation for Direct Speech Translation
Mauro Cettolo | Marco Gaido | Marco Turchi | Matteo Negri
[1] Nadir Durrani,et al. Findings of the IWSLT 2020 Evaluation Campaign , 2020, IWSLT.
[2] Hermann Ney,et al. Automatic sentence segmentation and punctuation prediction for spoken language translation , 2006, IWSLT.
[3] Tomasz Potapczyk,et al. SRPOL’s System for the IWSLT 2020 End-to-End Speech Translation Task , 2020, IWSLT.
[4] Yannick Estève,et al. TED-LIUM 3: twice as much data and corpus repartition for experiments on speaker adaptation , 2018, SPECOM.
[5] Sanjeev Khudanpur,et al. Librispeech: An ASR corpus based on public domain audio books , 2015, ICASSP.
[6] Rico Sennrich,et al. Neural Machine Translation of Rare Words with Subword Units , 2015, ACL.
[7] Matteo Negri,et al. On Target Segmentation for Direct Speech Translation , 2020, AMTA.
[8] Sylvain Meignier,et al. LIUM SpkDiarization: An Open Source Toolkit for Diarization , 2010.
[9] Matteo Negri,et al. On Knowledge Distillation for Direct Speech Translation , 2020, CLiC-it.
[10] Jiajun Zhang,et al. End-to-End Speech Translation with Knowledge Distillation , 2019, INTERSPEECH.
[11] Quoc V. Le,et al. SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition , 2019, INTERSPEECH.
[12] Geoffrey E. Hinton,et al. Distilling the Knowledge in a Neural Network , 2015, ArXiv.
[13] Alex Waibel,et al. Improving Sequence-To-Sequence Speech Recognition Training with On-The-Fly Data Augmentation , 2020, ICASSP.
[14] Mattia Antonino Di Gangi,et al. MuST-C: a Multilingual Speech Translation Corpus , 2019, NAACL.
[15] Florian Metze,et al. How2: A Large-scale Dataset for Multimodal Language Understanding , 2018, NIPS.
[16] Peter Bell,et al. A semi-Markov model for speech segmentation with an utterance-break prior , 2014, INTERSPEECH.
[17] Sergey Ioffe,et al. Rethinking the Inception Architecture for Computer Vision , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[18] Yuan Cao,et al. Leveraging Weakly Supervised Data to Improve End-to-end Speech-to-text Translation , 2018, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[19] Matteo Negri,et al. Adapting Transformer to End-to-End Spoken Language Translation , 2019, INTERSPEECH.
[20] Wonyong Sung,et al. A statistical model-based voice activity detection , 1999, IEEE Signal Processing Letters.
[21] Lukasz Kaiser,et al. Attention is All you Need , 2017, NIPS.
[22] Salim Roukos,et al. Bleu: a Method for Automatic Evaluation of Machine Translation , 2002, ACL.
[23] Matthew G. Snover,et al. A Study of Translation Edit Rate with Targeted Human Annotation , 2006, AMTA.
[24] Alexander M. Fraser,et al. Determining the placement of German verbs in English-to-German SMT , 2012, EACL.
[25] Matteo Negri,et al. Is 42 the Answer to Everything in Subtitling-oriented Speech Translation? , 2020, IWSLT.
[26] Mauro Cettolo,et al. MMT: New Open Source MT for the Translation Industry , 2017.
[27] Evgeny Matusov,et al. Start-Before-End and End-to-End: Neural Speech Translation by AppTek and RWTH Aachen University , 2020, IWSLT.
[28] Alfons Juan-Císcar,et al. Europarl-ST: A Multilingual Corpus for Speech Translation of Parliamentary Debates , 2020, ICASSP.
[29] Marta R. Costa-jussà,et al. Findings of the 2019 Conference on Machine Translation (WMT19) , 2019, WMT.
[30] Jörg Tiedemann,et al. OpenSubtitles2016: Extracting Large Parallel Corpora from Movie and TV Subtitles , 2016, LREC.
[31] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[32] Ashish Agarwal,et al. Hallucinations in Neural Machine Translation , 2018.
[33] Mattia Antonino Di Gangi,et al. MuST-C: A multilingual corpus for end-to-end speech translation , 2021, Comput. Speech Lang.
[34] Jörg Tiedemann,et al. OPUS – parallel corpora for everyone , 2016, EAMT.