Paper information - SPEECH SYNTHESIS FOR TEXT-TO-SPEECH ALIGNMENT AND PROSODIC FEATURE EXTRACTION


The aim of this paper is to present a new and promising approach to the text-to-speech alignment problem. For this purpose, an original idea is developed: a high-quality digital speech synthesizer is used to create a reference speech pattern for the alignment process. The system has been used and tested to extract the prosodic features of read French utterances. The results show a segmentation error rate of about 8%. This system will be a powerful tool for the automatic creation of large prosodically labeled databases and for research on automatic prosody generation.
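The abstract does not say which algorithm matches the synthetic reference to the recorded utterance. As a minimal sketch, assuming both signals have already been reduced to sequences of feature frames and that they are matched by dynamic time warping (an assumption, not something the abstract confirms), segment boundaries known in the synthetic reference could be carried over to the recording roughly as follows. The function names `dtw_path` and `transfer_boundaries` and the toy one-dimensional features are hypothetical.

```python
def dtw_path(ref, rec):
    """Return a minimum-cost warping path between two sequences of
    feature vectors as a list of (ref_index, rec_index) pairs."""
    n, m = len(ref), len(rec)
    inf = float("inf")

    def dist(a, b):
        # Euclidean distance between two feature frames.
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    # cost[i][j] = best accumulated cost aligning ref[:i] with rec[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(ref[i - 1], rec[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])

    # Backtrack from the end, always moving to the cheapest predecessor.
    i, j = n, m
    path = []
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return path


def transfer_boundaries(path, ref_boundaries):
    """Map boundary frame indices in the reference onto the first
    recorded frame that the warping path aligns with each of them."""
    first_match = {}
    for i, j in path:
        first_match.setdefault(i, j)
    return [first_match[b] for b in ref_boundaries]


# Toy data: the synthetic reference switches segment at frame 2,
# while the (slower) recording switches at frame 3.
ref = [[0.0], [0.0], [1.0], [1.0]]
rec = [[0.0], [0.0], [0.0], [1.0], [1.0], [1.0]]
print(transfer_boundaries(dtw_path(ref, rec), [2]))  # [3]
```

In this toy case the boundary placed at reference frame 2 lands on recorded frame 3. A real system would use spectral features of the two speech signals instead of the scalar frames used here.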

