Current State of Czech Text-to-Speech System ARTIC

This paper surveys the current state of ARTIC, a modern Czech concatenative corpus-based text-to-speech system. All stages of the system design are described, including the building of the acoustic unit inventory, text processing, and speech production. Two versions of the system are presented: a single-unit-instance system with moderate output speech quality, suitable for low-resource devices, and a multiple-unit-instance system with a dynamic unit instance selection scheme, yielding high-quality output speech. Both versions use automatically designed acoustic unit inventories. To ensure the desired prosodic characteristics of the output speech, the prosody generation issues specific to each system version are also discussed. Although the system was primarily designed for the synthesis of Czech speech, ARTIC can now speak three languages: Czech (both female and male voices are available), Slovak, and German.
