Synthesis of Polysyllabic Sequences of Thai Tones Using a Generative Model of Fundamental Frequency Contours

In this paper, we study the distinctive tones of Thai in running speech. We present rules for synthesizing F0 contours of Thai tones in running speech using a generative model of F0 contours. Using this method, we analyzed the pitch contours of Thai polysyllabic words, both disyllabic and trisyllabic, and observed coarticulation effects between Thai tones in running speech. Based on this model-based analysis of polysyllabic words, we derived rules and applied them to synthesize Thai polysyllabic tone sequences. We performed listening tests to evaluate the intelligibility of the tones generated by the rules. The average intelligibility scores were 98.8% and 96.6% for disyllabic and trisyllabic words, respectively. These results show that the tone generation rules are effective. Furthermore, we constructed connecting rules to synthesize suprasegmental F0 contours using the parameters of the rules trained on trisyllabic words. The parameters of the first, the third, and the second syllables were selected and assigned to the initial, the ending, and the remaining syllables in a sentence, respectively. Even with such a simple rule, the synthesized phrases/sentences were completely identified in listening tests. The mean opinion score (MOS) was 3.50, while the original and analysis/synthesis samples scored 4.82 and 3.59, respectively.
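
The connecting rule described in the abstract can be illustrated with a minimal Python sketch. It shows only the position assignment: the initial syllable of a sentence takes the parameters of the first syllable of a trisyllabic word, the final syllable takes those of the third, and every syllable in between takes those of the second. The function name `assign_positions`, the position labels, and the use of plain tone names are hypothetical choices made for illustration; the actual model parameters are not given here, and the behaviour for inputs shorter than two syllables is not specified in the abstract, so the sketch rejects them.

```python
from typing import List, Tuple


def assign_positions(tones: List[str]) -> List[Tuple[str, str]]:
    """Pair each syllable's tone with the trisyllabic-word position
    whose rule parameters it would reuse.

    Hypothetical helper: the labels "first", "second", "third" stand in
    for the parameter sets trained on trisyllabic words.
    """
    n = len(tones)
    if n < 2:
        # Assumption: the abstract does not say how a single syllable
        # is handled, so this sketch does not guess.
        raise ValueError("need at least two syllables")

    # Initial syllable -> first-syllable parameters,
    # final syllable -> third-syllable parameters,
    # all remaining syllables -> second-syllable parameters.
    positions = ["first"] + ["second"] * (n - 2) + ["third"]
    return list(zip(tones, positions))


if __name__ == "__main__":
    # Four-syllable example using Thai tone names.
    for tone, position in assign_positions(["mid", "low", "falling", "high"]):
        print(f"{tone:8s} uses {position}-syllable parameters")
```

In a full synthesizer, each (tone, position) pair would be used to look up the corresponding F0 model parameters; that lookup table is outside the scope of this sketch.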
