Paper Information - Prosody Modelling for TTS Systems Using Statistical Methods

Prosody Modelling for TTS Systems Using Statistical Methods

The main drawback of older methods of prosody modelling is the monotony of the output, which is perceived as uncomfortable by the users, especially when listening to longer passages. The present paper proposes a prosodic generator designed to increase the variability of synthesized speech in reading devices for the blind. The method used is based on text segmentation into several prosodic patterns by means of vector quantisation and the subsequent training of corresponding HMMs (Hidden Markov Models) on F0 parameters. The path through the model's states is then used to generate sentence prosody. We also tried to utilize morphological information in order to increase prosody naturalness. The evaluation of the quality of the proposed prosodic generators was carried out by means of listening tests.
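The vector quantisation step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the F0 contours below are invented sample values, the three-point contour representation is an assumption, and the helper names (`train_codebook`, `quantise`) are hypothetical. The sketch uses plain Lloyd-style (k-means) clustering to build a small codebook of prosodic patterns and maps each contour to its nearest codeword. In the approach the abstract describes, such pattern labels would then drive the training of the corresponding HMMs, which is not shown here.

```python
from typing import List, Sequence


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(codebook: List[List[float]], v: Sequence[float]) -> int:
    """Index of the codeword closest to v."""
    return min(range(len(codebook)), key=lambda i: squared_distance(codebook[i], v))


def train_codebook(vectors: List[List[float]], k: int, iterations: int = 20) -> List[List[float]]:
    """Lloyd-style vector quantisation: returns k codewords (cluster centroids)."""
    if not 1 <= k <= len(vectors):
        raise ValueError("k must be between 1 and the number of vectors")
    # Deterministic initialisation: pick k evenly spaced training vectors.
    step = len(vectors) // k
    codebook = [list(vectors[i * step]) for i in range(k)]
    for _ in range(iterations):
        # Assign every vector to its nearest codeword.
        clusters: List[List[List[float]]] = [[] for _ in range(k)]
        for v in vectors:
            clusters[nearest(codebook, v)].append(v)
        # Move each codeword to the mean of its members (keep it if empty).
        for i, members in enumerate(clusters):
            if members:
                codebook[i] = [sum(col) / len(members) for col in zip(*members)]
    return codebook


def quantise(codebook: List[List[float]], vectors: List[List[float]]) -> List[int]:
    """Replace each vector by the index of its nearest codeword."""
    return [nearest(codebook, v) for v in vectors]


# Invented F0 contours in Hz, three samples per prosodic unit (an assumption).
contours = [
    [100.0, 110.0, 120.0], [102.0, 112.0, 123.0],  # rising
    [120.0, 110.0, 100.0], [122.0, 111.0, 99.0],   # falling
    [110.0, 110.0, 110.0], [111.0, 109.0, 110.0],  # flat
]

codebook = train_codebook(contours, k=3)
labels = quantise(codebook, contours)
print(labels)
```

Each label identifies one prosodic pattern class. The codebook size `k` and the contour length are free parameters in this sketch and would need tuning on real data.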

Petr Horák | Zdenek Chaloupka
