Concatenation rules for demisyllable speech synthesis

A system for speech synthesis by rule is described which uses demisyllables (DSs) as phonetic units. The problem of concatenation is discussed in detail; this stage converts a string of phonetic symbols into a stream of speech parameter frames. For German, about 1650 DSs are required to permit synthesis of a very large vocabulary. Synthesis is controlled by 18 rules, which are used for splitting the phonetic string into DSs, for selecting the DSs in such a way that the inventory size is minimized, and, last but not least, for concatenation. The quality and intelligibility of the synthetic signal are very good: in a subjective test the median word intelligibility dropped from 96.6% for an LPC vocoder to 92.1% for the DS synthesis, and the quality difference between the DS synthesis and ordinary vocoded speech was judged very small.
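The splitting step can be illustrated with a minimal sketch. It is not the paper's rule set: the 18 rules are not given in this abstract, and the vowel set, the phone symbols, and the function names `split_syllable` and `to_demisyllables` are all hypothetical. The sketch assumes that each syllable is already available as a list of phone symbols with a single-symbol vowel nucleus, and that the cut is placed inside the nucleus so that the vowel belongs to both the initial and the final demisyllable.

```python
# Hypothetical vowel symbols for illustration only; the phonetic
# alphabet actually used by the system is not given in the abstract.
VOWELS = {"a", "e", "i", "o", "u", "@", "aI", "aU", "OY"}


def split_syllable(phones):
    """Split one syllable into (initial_ds, final_ds).

    Assumption: the syllable has one nucleus written as a single symbol.
    The cut falls inside the nucleus, so the vowel appears in both halves.
    """
    for i, phone in enumerate(phones):
        if phone in VOWELS:
            initial_ds = tuple(phones[: i + 1])  # onset consonants + vowel
            final_ds = tuple(phones[i:])         # vowel + coda consonants
            return initial_ds, final_ds
    raise ValueError("syllable has no vowel nucleus")


def to_demisyllables(syllables):
    """Convert a list of syllables into a flat list of demisyllable keys."""
    units = []
    for syllable in syllables:
        units.extend(split_syllable(syllable))
    return units


if __name__ == "__main__":
    # Two made-up syllables, written with the hypothetical symbols above.
    print(to_demisyllables([["S", "t", "a", "t"], ["h", "aU", "s"]]))
```

Each resulting tuple could serve as a lookup key into a stored demisyllable inventory. Keeping that inventory small, and joining the retrieved parameter frames at the cut inside the vowel, are the parts that the rules described in the paper address and that this sketch leaves out.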
