Paper information - Text-to-Speech Synthesis Using Found Data for Low-Resource Languages - 字舞流文

Text-to-Speech Synthesis Using Found Data for Low-Resource Languages

Erica Lindsay Cooper | Erica Cooper

[1] Simon King, et al. Attributing modelling errors in HMM synthesis by stepping gradually from natural to modelled speech, 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[2] Julia Hirschberg, et al. Acoustic-Prosodic Indicators of Deception and Trust in Interview Dialogues, 2018, INTERSPEECH.

[3] Daniel Povey, et al. The Kaldi Speech Recognition Toolkit, 2011.

[4] Julia Hirschberg, et al. Data Selection and Adaptation for Naturalness in HMM-Based Speech Synthesis, 2016, INTERSPEECH.

[5] Julia Hirschberg, et al. Acoustic/prosodic and lexical correlates of charismatic speech, 2005, INTERSPEECH.

[6] Louis C. W. Pols, et al. Frisian TTS, an example of bootstrapping TTS for minority languages, 2004, Speech Synthesis Workshop.

[7] Hermann Ney, et al. Joint-sequence models for grapheme-to-phoneme conversion, 2008, Speech Commun.

[8] Junichi Yamagishi, et al. Average-Voice-Based Speech Synthesis, 2006.

[9] R. H. Bernacki, et al. Effects of noise on speech production: acoustic and perceptual analyses, 1988, The Journal of the Acoustical Society of America.

[10] Alan W. Black, et al. Adaptation techniques for speech synthesis in under-resourced languages, 2010, SLTU.

[11] Samy Bengio, et al. Tacotron: Towards End-to-End Speech Synthesis, 2017, INTERSPEECH.

[12] Matt Post, et al. The Language Demographics of Amazon Mechanical Turk, 2014, TACL.

[13] R. Kubichek, et al. Mel-cepstral distance measure for objective speech quality assessment, 1993, Proceedings of IEEE Pacific Rim Conference on Communications Computers and Signal Processing.

[14] Simon King, et al. Evaluation of objective measures for intelligibility prediction of HMM-based synthetic speech in noise, 2011, 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[15] Marc Schröder, et al. The German Text-to-Speech Synthesis System MARY: A Tool for Research, Development and Teaching, 2003, Int. J. Speech Technol.

[16] Oliver Watts, et al. Unsupervised and lightly-supervised learning for rapid construction of TTS systems in multiple languages from 'found' data: evaluation and analysis, 2013, SSW.

[17] Paul Boersma, et al. Praat, a system for doing phonetics by computer, 2002.

[18] Sabine Buchholz, et al. Automatic Sentence Selection from Speech Corpora Including Diverse Speech for Improved HMM-TTS Synthesis Quality, 2011, INTERSPEECH.

[19] Bogdan Orza, et al. The SWARA speech corpus: A large parallel Romanian read speech dataset, 2017, 2017 International Conference on Speech Technology and Human-Computer Dialogue (SpeD).

[20] Simon King, et al. A comparison of open-source segmentation architectures for dealing with imperfect data from the media in speech synthesis, 2014, INTERSPEECH.

[21] Zhizheng Wu, et al. Merlin: An Open Source Neural Network Speech Synthesis System, 2016, SSW.

[22] Julia Hirschberg, et al. Comparing American and Palestinian perceptions of charisma using acoustic-prosodic and lexical analysis, 2007, INTERSPEECH.

[23] A. W. Black, et al. Unit selection without a phoneme set, 2002, Proceedings of 2002 IEEE Workshop on Speech Synthesis.

[24] Heiga Zen, et al. Statistical parametric speech synthesis using deep neural networks, 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.

[25] Alan W. Black, et al. Utterance Selection Techniques for TTS Systems Using Found Speech, 2016, SSW.

[26] Avashna Govender, et al. Objective measures to improve the selection of training speakers in HMM-based child speech synthesis, 2016, 2016 Pattern Recognition Association of South Africa and Robotics and Mechatronics International Conference (PRASA-RobMech).

[27] Simon King, et al. Thousands of Voices for HMM-Based Speech Synthesis–Analysis and Application of TTS Systems Built on Various ASR Corpora, 2009, IEEE Transactions on Audio, Speech, and Language Processing.

[28] Nivja H. Jong, et al. Praat script to detect syllable nuclei and measure speech rate automatically, 2009, Behavior Research Methods.

[29] Mickael Rouvier, et al. An open-source state-of-the-art toolbox for broadcast news diarization, 2013, INTERSPEECH.

[30] Xin Wang, et al. A Comparative Study of the Performance of HMM, DNN, and RNN based Speech Synthesis Systems Trained on Very Large Speaker-Dependent Corpora, 2016, SSW.

[31] Yoshua Bengio, et al. Char2Wav: End-to-End Speech Synthesis, 2017, ICLR.

[32] Alan W. Black, et al. Text to speech in new languages without a standardized orthography, 2013, SSW.