Automated Transcription System for Malayalam Language

Malayalam is one of the 22 scheduled languages of India, with more than 130 million speakers. This paper reports on the development of a speaker-independent, continuous speech transcription system for Malayalam. The system uses Hidden Markov Models (HMMs) for acoustic modeling and Mel-frequency cepstral coefficients (MFCCs) for feature extraction. It is trained on speech from 21 male and female speakers aged 20 to 40 years. When tested on a set of continuous speech data, the system achieved a word recognition accuracy of 87.4% and a sentence recognition accuracy of 84%.
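The abstract does not say how the two accuracy figures were computed. As a minimal sketch, assuming the common convention in which word accuracy is (N - S - D - I) / N over the reference words and sentence accuracy is the share of utterances transcribed exactly, the metrics could be calculated as below. The function names and the placeholder transcripts are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two word sequences (S + D + I)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def recognition_accuracy(pairs: List[Tuple[str, str]]) -> Tuple[float, float]:
    """Return (word accuracy %, sentence accuracy %) for (reference, hypothesis) pairs.

    Assumed definitions, not confirmed by the paper:
      word accuracy     = 100 * (N - errors) / N, with N the number of reference words
      sentence accuracy = 100 * exactly matching sentences / number of sentences
    """
    total_words = 0
    total_errors = 0
    correct_sentences = 0
    for ref_text, hyp_text in pairs:
        ref, hyp = ref_text.split(), hyp_text.split()
        total_words += len(ref)
        total_errors += edit_distance(ref, hyp)
        correct_sentences += int(ref == hyp)
    word_acc = 100.0 * (total_words - total_errors) / total_words
    sent_acc = 100.0 * correct_sentences / len(pairs)
    return word_acc, sent_acc


if __name__ == "__main__":
    # Placeholder tokens stand in for transcribed Malayalam words.
    demo = [("w1 w2 w3 w4", "w1 w2 w3 w4"),
            ("w1 w2 w3 w4", "w1 wX w3 w4")]
    print(recognition_accuracy(demo))
```

Under this convention a single substituted word lowers word accuracy only slightly but marks the whole sentence as wrong, which is one reason a sentence accuracy is typically lower than the corresponding word accuracy.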
