Maithili Text-to-Speech System

This paper discusses the development of a text-to-speech (TTS) system for the Maithili language. A 5-hour speech corpus was obtained from LDCIL, CIIL Mysore, and a further 3 hours were recorded from native speakers in a studio environment. Because most Indian languages, including Maithili, are syllabic in nature, a concatenative method is used for speech generation, with the syllable as the basic unit. To enhance the naturalness of the speech output, the 1055 most frequently occurring words have been recorded and stored as whole units. The system accepts UTF-16 text input, and the interface is developed in C#.NET. The speech database consists of 930 units in total: each of the three positions contributes 300 syllables (C*V) and 10 independent vowels, that is, 310 units per position. Subjective evaluations, the Mean Opinion Score (MOS) and the Modified Rhyme Test (MRT), were conducted with 10 native speakers. The quality of the synthesized speech, in terms of intelligibility and naturalness, was evaluated to be approximately 84 percent. The relevance of the work lies in the fact that no TTS system exists for the Maithili language to date.
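
The unit inventory and the word-before-syllable lookup described above can be illustrated with a minimal Python sketch. It is not the authors' implementation (the interface is stated to be in C#.NET). The position labels `initial`, `medial`, and `final`, the functions `inventory_size`, `position_of`, and `plan_units`, and the rule that a recorded word takes priority over its syllables are all assumptions made for illustration. Syllabification is taken as given input, and the example words are romanized placeholders.

```python
# Hypothetical sketch: names and lookup order are assumptions, not the paper's code.
SYLLABLES_PER_POSITION = 300  # C*V syllables per position (from the abstract)
VOWELS_PER_POSITION = 10      # independent vowels per position (from the abstract)
POSITIONS = ("initial", "medial", "final")  # assumed labels for the three positions


def inventory_size() -> int:
    """Total number of stored syllable-level units across all positions."""
    return (SYLLABLES_PER_POSITION + VOWELS_PER_POSITION) * len(POSITIONS)


def position_of(index: int, count: int) -> str:
    """Assumed rule: first syllable is initial, last is final, the rest medial."""
    if index == 0:
        return "initial"
    if index == count - 1:
        return "final"
    return "medial"


def plan_units(word, syllables, recorded_words):
    """Return the units to concatenate for one word.

    A word found in the recorded-word set is used whole; otherwise the
    word is built from position-specific syllable units.
    """
    if word in recorded_words:
        return [("word", word)]
    return [(position_of(i, len(syllables)), s) for i, s in enumerate(syllables)]


if __name__ == "__main__":
    print(inventory_size())
    print(plan_units("kamala", ["ka", "ma", "la"], recorded_words=set()))
```

Under these assumptions, `inventory_size()` reproduces the 930 units stated in the abstract, since (300 + 10) × 3 = 930.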
