Automatic Heteronym Resolution Pipeline Using RAD-TTS Aligners
[1] M. Ravanelli, et al. SoundChoice: Grapheme-to-Phoneme Models with Semantic Disambiguation, 2022, INTERSPEECH.
[2] David Jurgens, et al. ByT5 model for massively multilingual grapheme-to-phoneme conversion, 2022, INTERSPEECH.
[3] Boris Ginsburg, et al. Mixer-TTS: Non-Autoregressive, Fast and Compact Text-to-Speech Model Conditioned on Language Model Embeddings, 2021, ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[4] Adrian Lancucki, et al. One TTS Alignment to Rule Them All, 2021, ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[5] Daniel Tihelka, et al. T5G2P: Using Text-to-Text Transfer Transformer for Grapheme-to-Phoneme Conversion, 2021, INTERSPEECH.
[6] Marco Nicolis, et al. Homograph disambiguation with contextual word embeddings for TTS systems, 2021, 11th ISCA Speech Synthesis Workshop (SSW 11).
[7] Boris Ginsburg, et al. Hi-Fi Multi-Speaker English TTS Dataset, 2021, INTERSPEECH.
[8] Kevin J. Shih, et al. RAD-TTS: Parallel Flow-Based TTS with Robust Alignment Learning and Diverse Synthesis, 2021.
[9] Ming-Wei Chang, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.
[10] Kyle Gorman, et al. Improving homograph disambiguation with supervised machine learning, 2018, LREC.
[11] Hideharu Nakajima, et al. Dataset Construction Method for Word Reading Disambiguation, 2018, PACLIC.
[12] Yu Hu, et al. Heteronym Verification for Mandarin Speech Synthesis, 2008, 2008 6th International Symposium on Chinese Spoken Language Processing.
[13] Mark A. Pitt, et al. The Buckeye corpus of speech: updates and enhancements, 2007, INTERSPEECH.
[14] David Yarowsky, et al. Homograph Disambiguation in Text-to-Speech Synthesis, 1997.
[15] K. Matsuoka, et al. Natural language processing in a Japanese text-to-speech system for written-style texts, 1996, Proceedings of IVTTA '96. Workshop on Interactive Voice Technology for Telecommunications Applications.