Enhancing Speech-to-Speech Translation with Multiple TTS Targets
Shinji Watanabe | J. Pino | H. Inaguma | Jiatong Shi | Changhan Wang | Ann Lee | Yun Tang
[1] J. Niehues, et al. LibriS2S: A German-English Speech-to-Speech Translation Corpus, 2022, LREC.
[2] Yossi Adi, et al. Enhanced Direct Speech-to-Speech Translation Using Self-supervised Pre-training and Data Augmentation, 2022, INTERSPEECH.
[3] A. Conneau, et al. Leveraging unsupervised and weakly-supervised data to improve direct speech-to-speech translation, 2022, INTERSPEECH.
[4] Andy T. Liu, et al. SUPERB-SG: Enhanced Speech processing Universal PERformance Benchmark for Semantic and Generative Capabilities, 2022, ACL.
[5] Michelle Tadmor Ramanovich, et al. CVSS Corpus and Massively Multilingual Speech-to-Speech Translation, 2022, LREC.
[6] H. Schwenk, et al. Textless Speech-to-Speech Translation on Real Data, 2021, NAACL.
[7] Tomoki Toda, et al. S3PRL-VC: Open-Source Voice Conversion Framework with Self-Supervised Speech Representations, 2021, 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[8] Michelle Tadmor Ramanovich, et al. Translatotron 2: High-quality direct speech-to-speech translation with voice preservation, 2021, ICML.
[9] A. Polyak, et al. Direct Speech-to-Speech Translation With Discrete Units, 2021, ACL.
[10] Shinji Watanabe, et al. ESPnet2-TTS: Extending the Edge of TTS Research, 2021, ArXiv.
[11] Ruslan Salakhutdinov, et al. HuBERT: Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units, 2021, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[12] Jungil Kong, et al. Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech, 2021, ICML.
[13] Andy T. Liu, et al. SUPERB: Speech processing Universal PERformance Benchmark, 2021, Interspeech.
[14] Eugene Kharitonov, et al. Speech Resynthesis from Discrete Disentangled Self-Supervised Representations, 2021, Interspeech.
[15] Emmanuel Dupoux, et al. On Generative Spoken Language Modeling from Raw Audio, 2021, Transactions of the Association for Computational Linguistics.
[16] Satoshi Nakamura, et al. Transformer-Based Direct Speech-To-Speech Translation with Transcoder, 2021, 2021 IEEE Spoken Language Technology Workshop (SLT).
[17] Guillaume Fuchs, et al. StyleMelGAN: An Efficient High-Fidelity Adversarial Vocoder with Temporal Adaptive Normalization, 2020, 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[18] Tie-Yan Liu, et al. UWSpeech: Speech to Speech Translation for Unwritten Languages, 2020, AAAI.
[19] Tie-Yan Liu, et al. FastSpeech 2: Fast and High-Quality End-to-End Text to Speech, 2020, ICLR.
[20] Kenneth Heafield, et al. Direct simultaneous speech to speech translation, 2021, ArXiv.
[21] Jaehyeon Kim, et al. HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis, 2020, NeurIPS.
[22] Heiga Zen, et al. Non-Attentive Tacotron: Robust and Controllable Neural TTS Synthesis Including Unsupervised Duration Modeling, 2020, ArXiv.
[23] Shinji Watanabe, et al. DiscreTalk: Text-to-Speech as a Machine Translation Problem, 2020, ArXiv.
[24] Marjan Ghazvininejad, et al. Multilingual Denoising Pre-training for Neural Machine Translation, 2020, Transactions of the Association for Computational Linguistics.
[25] Ryuichi Yamamoto, et al. Parallel WaveGAN: A Fast Waveform Generation Model Based on Generative Adversarial Networks with Multi-Resolution Spectrogram, 2019, 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[26] K. Takeda, et al. ESPnet-TTS: Unified, Reproducible, and Integratable Open Source End-to-End Text-to-Speech Toolkit, 2019, 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[27] Melvin Johnson, et al. Direct speech-to-speech translation with a sequence-to-sequence model, 2019, INTERSPEECH.
[28] Frank Hutter, et al. Decoupled Weight Decay Regularization, 2017, ICLR.
[29] Matt Post, et al. A Call for Clarity in Reporting BLEU Scores, 2018, WMT.
[30] Navdeep Jaitly, et al. Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions, 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[31] Tomoki Toda, et al. Preserving Word-Level Emphasis in Speech-to-Speech Translation, 2017, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[32] Ralph Roskies, et al. Bridges: a uniquely flexible HPC resource for new communities and data analytics, 2015, XSEDE.
[33] Satoshi Nakamura, et al. Multilingual Speech-to-Speech Translation System: VoiceTra, 2013, 2013 IEEE 14th International Conference on Mobile Data Management.
[34] Jordi Adell, et al. Prosody Generation for Speech-to-Speech Translation, 2006, 2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings.
[35] Eiichiro Sumita, et al. Creating corpora for speech-to-speech translation, 2003, INTERSPEECH.
[36] Enrique Vidal, et al. Finite-state speech-to-speech translation, 1997, 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing.