Paper information - Concatenation rules for demisyllable speech synthesis

Concatenation rules for demisyllable speech synthesis

A system for speech synthesis by rule is described that uses demisyllables (DSs) as phonetic units. The problem of concatenation is discussed in detail; the pertinent stage converts a string of phonetic symbols into a stream of speech parameter frames. For German, about 1650 DSs are required to permit synthesizing a very large vocabulary. Synthesis is controlled by 18 rules, which are used for splitting the phonetic string into DSs, for selecting the DSs in such a way that the inventory size is minimized, and, last but not least, for concatenation. The quality and intelligibility of the synthetic signal are very good; in a subjective test the median word intelligibility dropped from 96.6% for an LPC vocoder to 92.1% for the DS synthesis, and the quality difference between the DS synthesis and ordinary vocoded speech was judged very small.
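The splitting step named in the abstract can be illustrated with a minimal sketch. It assumes the common convention that each syllable is cut inside its vowel nucleus, so the initial demisyllable carries the onset consonants plus the vowel and the final demisyllable carries the vowel plus the coda consonants. This is not the paper's rule set: the 18 rules are not given in the abstract, and the toy symbol set `VOWELS` and the function names `split_syllable` and `to_demisyllables` are hypothetical.

```python
# Toy phoneme inventory (assumption): one symbol per vowel nucleus.
VOWELS = {"a", "e", "i", "o", "u"}


def split_syllable(phones):
    """Split one syllable into (initial, final) demisyllables.

    Both halves include the vowel, reflecting a cut inside the nucleus.
    Assumes the syllable contains exactly one vowel symbol.
    """
    vowel_positions = [i for i, p in enumerate(phones) if p in VOWELS]
    if len(vowel_positions) != 1:
        raise ValueError("expected exactly one vowel nucleus per syllable")
    v = vowel_positions[0]
    initial = tuple(phones[: v + 1])  # onset consonants + vowel
    final = tuple(phones[v:])         # vowel + coda consonants
    return initial, final


def to_demisyllables(syllables):
    """Turn a list of syllables into a flat demisyllable sequence."""
    units = []
    for syllable in syllables:
        units.extend(split_syllable(syllable))
    return units


# Example: a single syllable with a three-consonant onset and two-consonant coda.
demo = to_demisyllables([["S", "t", "r", "a", "n", "t"]])
```

In a full system each resulting unit would index a stored sequence of speech parameter frames, and separate concatenation rules would join those frame sequences; that part is not sketched here.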

Wolfgang Hess | Helmut Dettweiler
