Paper information - High-accuracy automatic segmentation

High-accuracy automatic segmentation

We propose a system for automatically determining boundaries between phonetic segments in a speech wave given a phonetic transcription: automatic segmentation. The system uses edge detectors that are applied to various speech representations; both are optimized for each diphone or diphone class. Output from these detectors, which contains spuriously detected edges, is then combined with alternative pronunciations generated via rules from the canonical pronunciation. The final output is generated with lowest-cost path algorithms applied to finite

Richard Sproat | Jan P. H. van Santen
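
The abstract describes picking segment boundaries from noisy edge-detector output with a lowest-cost path search. The following is a minimal sketch of that general idea, not the authors' implementation. The function `pick_boundaries`, the per-candidate cost, the expected-duration penalty, and `dur_weight` are all hypothetical and chosen only for illustration.

```python
from math import inf


def pick_boundaries(candidates, expected_durs, start=0.0, dur_weight=1.0):
    """Choose one candidate edge per boundary by dynamic programming.

    candidates:    list of (time, cost) pairs sorted by time; a lower cost
                   stands for a stronger detected edge (illustrative assumption).
    expected_durs: expected duration of the segment that ends at each
                   boundary, one entry per boundary (illustrative assumption).
    Returns the chosen boundary times in increasing order.
    """
    n, k = len(candidates), len(expected_durs)
    # best[j][i]: lowest total cost with boundary j placed at candidate i
    best = [[inf] * n for _ in range(k)]
    back = [[-1] * n for _ in range(k)]

    for i, (t, c) in enumerate(candidates):
        best[0][i] = c + dur_weight * abs((t - start) - expected_durs[0])

    for j in range(1, k):
        for i, (t, c) in enumerate(candidates):
            for p in range(i):  # previous boundary must be an earlier candidate
                prev_t = candidates[p][0]
                dur_penalty = abs((t - prev_t) - expected_durs[j])
                cost = best[j - 1][p] + c + dur_weight * dur_penalty
                if cost < best[j][i]:
                    best[j][i] = cost
                    back[j][i] = p

    end = min(range(n), key=lambda i: best[k - 1][i])
    if best[k - 1][end] == inf:
        raise ValueError("not enough candidate edges for the transcription")

    # Trace the lowest-cost path back from the last boundary.
    path, i = [], end
    for j in range(k - 1, -1, -1):
        path.append(candidates[i][0])
        i = back[j][i]
    return path[::-1]


# Five candidate edges, some of them spurious (high cost), and two boundaries.
edges = [(0.10, 0.2), (0.12, 0.9), (0.25, 0.1), (0.31, 0.8), (0.40, 0.3)]
chosen = pick_boundaries(edges, expected_durs=[0.10, 0.15])
```

In this toy input the spurious edges at 0.12 s and 0.31 s carry high costs, so the search skips them in favor of the candidates that agree with the expected durations.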

[1] Colin W. Wightman, et al. The aligner: text to speech alignment using Markov models and a pronunciation dictionary, 1994, SSW.

[2] Andrej Ljolje, et al. Automatic segmentation of speech for TTS, 1993, EUROSPEECH.

[3] Richard Sproat, et al. Multilingual Text-to-Speech Synthesis: The Bell Labs Approach, 1998, CL.

[4] A. Jongman. Acoustics of American English Speech: A Dynamic Approach, 1995.

[5] David B. Pisoni, et al. Text-to-speech: the MITalk system, 1987.