Implementation of Concatenation Technique for Low Resource Text-To-Speech System Based on Marathi Talking Calculator

A sound grasp of basic mathematical concepts opens up numerous opportunities in life for every individual, including visually impaired people. Assistive technology helps people with disabilities become more independent and overcome barriers in education and employment. This research designs an Android-based application, a talking calculator, for Marathi, a low-resource native language. The novelty of this work lies in developing both the application and the Marathi number corpus. Marathi is an Indo-Aryan language spoken by approximately 6.99 million speakers in India, making it the third most widely spoken language after Bengali and Telugu; because it lacks linguistic resources such as grammars, POS taggers, and corpora, it falls into the category of low-resource languages. The front end of the application presents the screen of a basic calculator with numerals displayed in Marathi. At run time, each number is spoken as its key is pressed, and the application also speaks the operation to be performed. The concatenation synthesis technique is applied to speak the digits in the decimal places of the output number, and the result is spoken with the proper place value of each digit in Marathi. The system achieves an accuracy rate of 95.5%. The average run time of the application is measured at 2.64 s. Feedback and reviews of the application were also collected from real end users, i.e., blind people.
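As a rough illustration of how a result might be turned into a sequence of recorded units for concatenation, the following Python sketch splits a decimal string into place-value units for the integer part and single digits for the decimal places. It is not the authors' implementation: the clip names (`num_42`, `hundred`, `thousand`, `lakh`, `crore`, `point`), the assumption that the corpus holds one recording for each number from 0 to 99, and the function `unit_sequence` are all hypothetical, and the actual Marathi unit inventory may differ.

```python
# Hypothetical place-value markers in the Indian numbering system,
# largest first. Each name stands for one assumed recorded clip.
PLACES = [
    (10_000_000, "crore"),
    (100_000, "lakh"),
    (1_000, "thousand"),
    (100, "hundred"),
]


def unit_sequence(value: str) -> list[str]:
    """Return the ordered (hypothetical) clip names for a non-negative decimal string."""
    if value.startswith("-"):
        raise ValueError("this sketch handles non-negative numbers only")
    integer_part, _, fraction_part = value.partition(".")
    n = int(integer_part or "0")
    if n >= 1_000_000_000:
        raise ValueError("this sketch handles values below 10**9 only")

    units: list[str] = []
    if n == 0:
        units.append("num_0")
    else:
        for size, name in PLACES:
            count, n = divmod(n, size)
            if count:
                # e.g. 2 hundreds -> clip for "2" followed by clip for "hundred"
                units += [f"num_{count}", name]
        if n:
            # Remainder 1..99 is assumed to exist as a single recorded unit.
            units.append(f"num_{n}")

    if fraction_part:
        # Decimal places are spoken digit by digit after a "point" clip.
        units.append("point")
        units += [f"num_{digit}" for digit in fraction_part]
    return units


if __name__ == "__main__":
    print(unit_sequence("1234.05"))
```

The returned list would then be mapped to audio files and played or joined in order; keeping the sequencing step separate from audio playback makes the place-value logic easy to check on its own.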
