Implementation of Three Text to Speech Systems for Kurdish Language

Most modern TTS systems use the concatenative method to produce artificial speech. The main challenge in this method is choosing an appropriate unit for building the speech database. The unit must guarantee smooth, high-quality speech, and building a database for it must be feasible and inexpensive. For example, the syllable, phoneme, allophone, and diphone are all appropriate units for general-purpose systems. In this paper, we implement three synthesis systems for the Kurdish language, based on the syllable, the allophone, and the diphone, and compare their quality using subjective testing.
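To make the idea of unit-based concatenation concrete, the following is a minimal sketch for the diphone case. It is not the authors' implementation: the function names (`to_diphones`, `crossfade_concat`, `synthesize`), the `_` silence symbol, the `a-b` naming scheme, and the linear crossfade at the joints are all assumptions chosen for illustration, and the toy database holds short lists of numbers rather than recorded speech.

```python
from typing import Dict, List, Sequence


def to_diphones(phonemes: Sequence[str], silence: str = "_") -> List[str]:
    """Turn a phoneme sequence into diphone names, padded with silence.

    The "a-b" naming scheme and the "_" silence symbol are hypothetical.
    """
    padded = [silence, *phonemes, silence]
    return [f"{a}-{b}" for a, b in zip(padded, padded[1:])]


def crossfade_concat(units: List[List[float]], overlap: int) -> List[float]:
    """Join waveforms, linearly crossfading up to `overlap` samples per joint."""
    out: List[float] = []
    for unit in units:
        # Never overlap more samples than either side actually has.
        n = min(overlap, len(out), len(unit))
        start = len(out) - n
        for i in range(n):
            w = (i + 1) / (n + 1)  # weight of the incoming unit
            out[start + i] = (1 - w) * out[start + i] + w * unit[i]
        out.extend(unit[n:])
    return out


def synthesize(
    phonemes: Sequence[str],
    database: Dict[str, List[float]],
    overlap: int = 2,
) -> List[float]:
    """Look up each diphone in the database and concatenate the waveforms."""
    names = to_diphones(phonemes)
    missing = [name for name in names if name not in database]
    if missing:
        raise KeyError(f"units missing from database: {missing}")
    return crossfade_concat([database[name] for name in names], overlap)


if __name__ == "__main__":
    # Toy database: each "recording" is just a few constant samples.
    toy_db = {"_-a": [0.0, 0.2, 0.4], "a-b": [0.4, 0.6, 0.8], "b-_": [0.8, 0.4, 0.0]}
    print(synthesize(["a", "b"], toy_db, overlap=1))
```

The same lookup-and-join structure would apply to syllable or allophone units; only the segmentation step and the size of the database change, which is the trade-off between smoothness and database cost described above.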
