Schwa-Deletion in Hindi Text-to-Speech Synthesis

We describe the phenomenon of schwa-deletion in Hindi and how it is handled in the pronunciation component of a multilingual concatenative text-to-speech system. Each consonant in written Hindi is associated with an “inherent” schwa vowel that is not represented in the orthography. For instance, the Hindi word pronounced [namak] (‘salt’) is written using only the consonantal characters for [n], [m], and [k]. Two main factors complicate schwa pronunciation in Hindi. First, not every inherent schwa within a word is pronounced. Second, in multimorphemic words, a morpheme boundary can block schwa-deletion where it might otherwise occur. We propose a model for schwa-deletion that combines a general-purpose schwa-deletion rule proposed in the linguistics literature (Ohala, 1983) with additional morphological analysis, which is necessitated by the high frequency of compounds in our database. The system is implemented using finite-state transducer technology.
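To make the two complicating factors concrete, here is a minimal sketch in Python. It assumes the commonly cited form of Ohala's rule, in which a schwa deletes in the context VC _ CV and the rule is applied from right to left, plus deletion of a word-final schwa. Neither the rule's exact form nor any of the names below (`delete_schwas`, the vowel inventory, the `+` boundary symbol) come from this abstract; they are illustrative assumptions. The system described here is built with finite-state transducers, whereas this sketch is a plain list scan and is not the authors' implementation.

```python
# Illustrative sketch only: the names, the vowel inventory, and the "+"
# boundary symbol are assumptions, not taken from the paper.
SCHWA = "ə"
VOWELS = {"aa", "i", "ii", "u", "uu", "e", "o"}  # hypothetical inventory
BOUNDARY = "+"  # hypothetical morpheme-boundary marker


def is_vowel(p):
    return p == SCHWA or p in VOWELS


def is_consonant(p):
    return p != BOUNDARY and not is_vowel(p)


def delete_schwas(phones):
    """Apply an assumed schwa-deletion rule to a list of phoneme strings.

    `phones` contains every inherent schwa explicitly; BOUNDARY may mark a
    morpheme boundary. Returns the list with deleted schwas and boundary
    markers removed.
    """
    out = list(phones)
    # Assumed step 1: drop a word-final inherent schwa.
    if len(out) > 1 and out[-1] == SCHWA:
        out.pop()
    # Assumed step 2: scan right to left, deleting schwa in VC _ CV.
    # A boundary symbol inside the context is neither V nor C, so the
    # context fails to match and deletion is blocked.
    i = len(out) - 2
    while i >= 2:
        if (
            out[i] == SCHWA
            and is_consonant(out[i - 1])
            and is_vowel(out[i - 2])
            and is_consonant(out[i + 1])
            and i + 2 < len(out)
            and is_vowel(out[i + 2])
        ):
            del out[i]
        i -= 1
    return [p for p in out if p != BOUNDARY]


# [namak] 'salt': only the word-final schwa is dropped.
namak = delete_schwas(["n", "ə", "m", "ə", "k", "ə"])
# An artificial sequence (not a real word), with and without a boundary.
plain = delete_schwas(["p", "aa", "t", "ə", "k", "aa"])
blocked = delete_schwas(["p", "aa", "t", "ə", "+", "k", "aa"])
```

In this sketch the medial schwa is deleted in `plain` but kept in `blocked`, because the boundary marker interrupts the VC _ CV context. Scanning from right to left also means that an earlier deletion can remove the context for a schwa further to the left, so adjacent schwas are not both deleted.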

[1] Michael C. Shapiro, et al. Outline of Hindi Grammar, 1974.

[2] Richard Sproat, et al. Multilingual text analysis for text-to-speech synthesis, 1996, Proceedings of the Fourth International Conference on Spoken Language Processing (ICSLP '96).

[3] Dennis H. Klatt, et al. Software for a cascade/parallel formant synthesizer, 1980.

[4] Manjari Ohala. Aspects of Hindi phonology, 1983.

[5] Richard Sproat, et al. The Bell Labs German text-to-speech system: an overview, 1997, EUROSPEECH.

[6] Richard Sproat. Multilingual text analysis for text-to-speech synthesis, 1996, Nat. Lang. Eng.

[7] Mehryar Mohri, et al. Finite-State Transducers in Language and Speech Processing, 1997, CL.

[8] Richard Shillcock, et al. Proceedings of EUROSPEECH-1991, 1991.

[9] Pramod Kumar Pandey, et al. Word accentuation in Hindi, 1989.

[10] Nick Campbell, et al. Optimising selection of units from speech databases for concatenative synthesis, 1995, EUROSPEECH.

[11] Jeffrey D. Ullman, et al. Introduction to Automata Theory, Languages and Computation, 1979.

[12] Martin Kay, et al. Regular Models of Phonological Rule Systems, 1994, CL.

[13] Mehryar Mohri, et al. A Rational Design for a Weighted Finite-State Transducer Library, 1997, Workshop on Implementing Automata.

[14] Proceedings of the 14th International Congress of Phonetic Sciences, 2000.

[15] Richard Sproat, et al. An Efficient Compiler for Weighted Rewrite Rules, 1996, ACL.

[16] K. Samudravijaya, et al. Indian accent text-to-speech system for web browsing, 2002.

[17] Jan P. H. van Santen, et al. Modeling segmental duration in German text-to-speech synthesis, 1996, Proceedings of the Fourth International Conference on Spoken Language Processing (ICSLP '96).

[18] Richard Sproat, et al. Recent advances in multilingual text-to-speech synthesis, 2001.

[19] Jan P. H. van Santen, et al. Assignment of segmental duration in text-to-speech synthesis, 1994, Comput. Speech Lang.

[20] Richard Sproat, et al. Text interpretation for TtS synthesis, 1997.

[21] R. J. Nelson, et al. Introduction to Automata, 1968.

[22] Bernard Comrie, et al. The World's Major Languages, 1987.

[23] Xavier A. Furtado, et al. Synthesis of unlimited speech in Indian languages using formant-based rules, 1996.

[24] P. V. S. Rao. VOICE: An integrated speech recognition synthesis system for the Hindi language, 1993, Speech Commun.

[25] Chilin Shih, et al. Issues in Text-to-Speech Conversion for Mandarin, 1996, Int. J. Comput. Linguistics Chin. Lang. Process.

[26] Rama Kant Agnihotri, et al. Hindi morphology: a word-based description, 1997.

[27] Richard Sproat. Multilingual Text-to-Speech Synthesis, 1997.

[28] Vishwas Udpikar, et al. A text-to-speech system for application by visually handicapped and illiterate, 1994, ICSLP.

[29] Bernd Möbius. Corpus-based speech synthesis: Methods and challenges, 2000.

[30] George Anton Kiraz, et al. Multilingual syllabification using weighted finite-state transducers, 1998, SSW.

[31] Eric Moulines, et al. Pitch-synchronous waveform processing techniques for text-to-speech synthesis using diphones, 1989, Speech Commun.

[32] Bernd Möbius, et al. Name pronunciation in German text-to-speech synthesis, 1997, ANLP.