Paper information - Controlling Prominence Realisation in Parametric DNN-Based Speech Synthesis

This work aims to improve text-to-speech synthesis for Wikipedia by advancing and implementing models of prosodic prominence. We propose a new system architecture with explicit prominence modeling a ...

Joakim Gustafson | Zofia Malisz | Jonas Beskow | Harald Berthelsen

[1] Petra Wagner, et al. Different parts of the same elephant: A roadmap to disentangle and connect different perspectives on prosodic prominence, 2015, ICPhS.

[2] Taniya Mishra, et al. Unsupervised prominence prediction for speech synthesis, 2013, INTERSPEECH.

[3] Alan W. Black, et al. The CMU Arctic speech databases, 2004, SSW.

[4] Mark Hasegawa-Johnson, et al. Models of dataset size, question design, and cross-language speech perception for speech crowdsourcing applications, 2015.

[5] Petra Wagner, et al. Prominence-Based Prosody Prediction for Unit Selection Speech Synthesis, 2011, INTERSPEECH.

[6] Petra Wagner, et al. The effect of priming on the correlations between prominence ratings and acoustic features, 2010.

[7] Jens Edlund, et al. WikiSpeech - enabling open source text-to-speech for Wikipedia, 2016, SSW.

[8] Juraj Simko, et al. Hierarchical representation and estimation of prosody using continuous wavelet transform, 2017, Comput. Speech Lang.

[9] Tim Mahrt, et al. Crowd-sourcing prosodic annotation, 2017.

[10] Klaus Zechner, et al. Using Crowdsourcing to Provide Prosodic Annotations for Non-Native Speech, 2011, INTERSPEECH.

[11] Okko Johannes Räsänen, et al. 3PRO - An unsupervised method for the automatic detection of sentence prominence in speech, 2016, Speech Commun.

[12] Pier Marco Bertinetto, et al. Prosodic prominence detection in Italian continuous speech using probabilistic graphical models, 2014.

[13] Christian Jensen, et al. Choosing a scale for measuring perceived prominence, 2005, INTERSPEECH.

[14] Julia Hirschberg. Speech Synthesis: Prosody, 2006.

[15] Fabio Tesser, et al. Experiments with signal-driven symbolic prosody for statistical parametric speech synthesis, 2013, SSW.

[16] Samer Al Moubayed, et al. Automatic Prominence Classification in Swedish, 2010.

[17] Andrew Rosenberg, et al. Cross-Language Prominence Detection, 2012.

[18] Andrew Rosenberg, et al. AuToBI - a tool for automatic ToBI annotation, 2010, INTERSPEECH.

[19] Srikanth Ronanki, et al. Listening test materials for "A template-based approach for speech synthesis intonation generation using LSTMs", 2016.