Segmentation Techniques in Speech Synthesis

A basic method of speech synthesis is discussed in which segments of recorded utterances are joined together to produce continuous speech. Each segment (A) contains parts of two phones, with their mutual influence in the middle of the segment, and (B) begins and ends at the phonetically most stable position of each phone. All segments containing the same articulatory sequence are grouped together and defined as a dyad. The method of synthesis described involves not only the articulatory phones, but also the intonation, stress, and durational aspects of speech. Various techniques for obtaining the segments for speech synthesis are discussed. The method is limited to a specific dialect and, in practice, to a single speaker. A large number of segments is required to synthesize any arbitrarily selected utterance within these restrictions of dialect and speaker.
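
The joining of segments described above can be illustrated with a minimal sketch. It is not the procedure of the paper itself: the boundary symbol `#`, the function names `phones_to_dyads` and `synthesize`, and the toy inventory of sample values are all assumptions made for illustration. The sketch also keys each segment on its pair of phones alone and ignores intonation, stress, and duration, which the method described takes into account.

```python
from typing import Dict, List, Tuple

Dyad = Tuple[str, str]

# Hypothetical boundary symbol standing for silence; not taken from the source.
SILENCE = "#"


def phones_to_dyads(phones: List[str]) -> List[Dyad]:
    """Pair each phone with its successor, padding with silence at both ends.

    Each pair names a segment that runs from the stable position of the
    first phone to the stable position of the second.
    """
    padded = [SILENCE] + list(phones) + [SILENCE]
    return list(zip(padded, padded[1:]))


def synthesize(phones: List[str], inventory: Dict[Dyad, List[float]]) -> List[float]:
    """Join the recorded segment for each dyad, in order, into one waveform."""
    samples: List[float] = []
    for dyad in phones_to_dyads(phones):
        if dyad not in inventory:
            # A missing entry shows why a large segment inventory is needed.
            raise KeyError(f"no recorded segment for dyad {dyad}")
        samples.extend(inventory[dyad])
    return samples


if __name__ == "__main__":
    # Toy inventory: the sample values are placeholders, not real recordings.
    toy_inventory: Dict[Dyad, List[float]] = {
        ("#", "k"): [0.0, 0.1],
        ("k", "ae"): [0.2, 0.3],
        ("ae", "t"): [0.4],
        ("t", "#"): [0.5, 0.0],
    }
    print(phones_to_dyads(["k", "ae", "t"]))
    print(synthesize(["k", "ae", "t"], toy_inventory))
```

Because every ordered pair of phones may need its own segment, an inventory for N phones can require on the order of N squared entries before prosodic variants are counted, which is consistent with the large number of segments noted above.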