Paper information - A phonologically motivated method of selecting non-uniform units

A phonologically motivated method of selecting non-uniform units

This paper describes a method for selecting units from a database of recorded speech, for use in a concatenative speech synthesiser. The simplest approach is to store one example of every possible unit. A more powerful method is to store multiple examples of each unit. The challenge for such a method is to provide an efficient means of selecting units from a practical inventory, to give the best approximation to the desired sequence in some clearly specified way. The method used in BT’s Laureate system uses mixed Nphone units. In theory such units could be of arbitrary size, but in practice they are constrained to a maximum of three phones. The method dynamically generates the unit sequence based on a global cost. Units are selected using purely phonologically motivated criteria, without reference to acoustic features, either desired or available within the inventory.
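The idea of choosing a sequence of mixed-size units by minimising a global cost can be illustrated with a small dynamic-programming sketch. This is not the Laureate algorithm: the function `select_units`, the per-unit mismatch costs, and the fixed `join_cost` are all hypothetical, chosen only to show how units of one to three phones might compete under a single summed cost.

```python
from typing import Dict, List, Sequence, Tuple

Unit = Tuple[str, ...]

MAX_UNIT_LEN = 3  # the abstract states that units are limited to three phones


def select_units(
    target: Sequence[str],
    inventory: Dict[Unit, float],
    join_cost: float = 1.0,
) -> Tuple[List[Unit], float]:
    """Cover `target` with inventory units at minimum total cost.

    `inventory` maps each available unit to a hypothetical phonological
    mismatch cost; `join_cost` is an assumed penalty per concatenation point.
    """
    n = len(target)
    inf = float("inf")
    best = [inf] * (n + 1)   # best[i]: cheapest cost of covering target[:i]
    back = [0] * (n + 1)     # back[i]: start index of the last unit used
    best[0] = 0.0

    for end in range(1, n + 1):
        for size in range(1, min(MAX_UNIT_LEN, end) + 1):
            start = end - size
            unit = tuple(target[start:end])
            if unit not in inventory or best[start] == inf:
                continue
            # A join is only needed when this unit follows an earlier one.
            cost = best[start] + inventory[unit]
            if start > 0:
                cost += join_cost
            if cost < best[end]:
                best[end] = cost
                back[end] = start

    if best[n] == inf:
        raise ValueError("target cannot be covered by the inventory")

    # Trace back from the end to recover the chosen unit sequence.
    units: List[Unit] = []
    end = n
    while end > 0:
        start = back[end]
        units.append(tuple(target[start:end]))
        end = start
    units.reverse()
    return units, best[n]
```

Because every join adds a penalty, longer units tend to win when they are available, while single phones act as a fallback. A real system would replace the fixed numbers with costs derived from its own phonological criteria.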

Peter Jackson | Andrew P. Breen
