Transcribing Paralinguistic Acoustic Cues to Target Language Text in Transformer-Based Speech-to-Text Translation
Satoshi Nakamura | Katsuhito Sudoh | Sakriani Sakti | Hirotaka Tokuyama
[1] Lukasz Kaiser,et al. Attention Is All You Need, 2017, NIPS.
[2] Sergey Rybin,et al. You Do Not Need More Data: Improving End-To-End Speech Recognition by Text-To-Speech Data Augmentation , 2020, 2020 13th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI).
[3] Jordi Adell,et al. Prosody Generation for Speech-to-Speech Translation , 2006, 2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings.
[4] Tomoki Toda,et al. A method for translation of paralinguistic information , 2012, IWSLT.
[5] Sanjeev Khudanpur,et al. Audio augmentation for speech recognition , 2015, INTERSPEECH.
[6] Ulrich Amsel. The Oxford Dictionary of English Grammar, 2016.
[7] Tomoki Toda,et al. Generalizing continuous-space translation of paralinguistic information , 2013, INTERSPEECH.
[8] Satoshi Nakamura,et al. Toward Multi-Features Emphasis Speech Translation: Assessment of Human Emphasis Production and Perception with Speech and Text Clues , 2018, 2018 IEEE Spoken Language Technology Workshop (SLT).
[9] A. Athanasiadou. On the subjectivity of intensifiers, 2007.
[10] D. Willett,et al. Using Synthetic Audio to Improve the Recognition of Out-of-Vocabulary Words in End-to-End ASR Systems , 2020, ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[11] Eiichiro Sumita,et al. Creating corpora for speech-to-speech translation , 2003, INTERSPEECH.
[12] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[13] Panayiotis G. Georgiou,et al. Toward transfer of acoustic cues of emphasis across languages , 2013, INTERSPEECH.
[14] Heiga Zen,et al. Hidden Semi-Markov Model Based Speech Synthesis System, 2006.
[15] Satoshi Nakamura,et al. Sequence-to-Sequence Models for Emphasis Speech Translation , 2018, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[16] Xiao Han,et al. Emotional Speech Recognition and Synthesis in Multiple Languages toward Affective Speech-to-Speech Translation System , 2014, 2014 Tenth International Conference on Intelligent Information Hiding and Multimedia Signal Processing.
[17] Peter Siemund,et al. Intensifiers and reflexives, 2000.
[18] Renaat Declerck. A comprehensive descriptive grammar of English, 1991.
[19] Matt Post,et al. A Call for Clarity in Reporting BLEU Scores , 2018, WMT.
[20] Tomoki Toda,et al. Preserving Word-Level Emphasis in Speech-to-Speech Translation , 2017, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[21] Alan W. Black,et al. Intent transfer in speech-to-speech machine translation , 2012, 2012 IEEE Spoken Language Technology Workshop (SLT).
[22] Eiichiro Sumita,et al. Comparative study on corpora for speech translation , 2006, IEEE Transactions on Audio, Speech, and Language Processing.
[23] Salim Roukos,et al. Bleu: a Method for Automatic Evaluation of Machine Translation , 2002, ACL.
[24] Andrew McCallum,et al. Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data , 2001, ICML.
[25] Melvin Johnson,et al. Direct speech-to-speech translation with a sequence-to-sequence model , 2019, INTERSPEECH.
[26] Raymond Chakhachiro. Contribution of prosodic and paralinguistic cues to the translation of evidentiary audio recordings, 2016.