Spoken Language Identification for Indian Languages Using Split and Merge EM Algorithm

The performance of a Language Identification (LID) system based on Gaussian Mixture Models (GMMs) is limited by the tendency of the Expectation Maximization (EM) algorithm to converge to local maxima of the likelihood. This paper describes an LID system that models the extracted features with GMMs trained using the Split and Merge Expectation Maximization algorithm, which improves the global convergence of EM. The resulting improvement in the learned mixture models in turn gives better LID performance. A maximum likelihood classifier is used to identify the language. The proposed method is tested on four languages.
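The classification step can be illustrated with a minimal sketch, under stated assumptions that are not taken from the paper: one diagonal-covariance GMM per language, stored as a `(weights, means, variances)` tuple of NumPy arrays, and an utterance given as a matrix of feature frames. All function names are hypothetical. The `merge_scores` helper shows one commonly used merge-candidate criterion from the split-and-merge EM literature (the inner product of posterior responsibility vectors); the abstract does not state which split and merge criteria the system uses, so it is included only as an example.

```python
import numpy as np


def component_log_probs(X, weights, means, variances):
    """log(w_k * N(x_t | mu_k, diag(var_k))) for each frame t and component k."""
    X = np.atleast_2d(np.asarray(X, dtype=float))
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    dim = X.shape[1]
    diff = X[:, None, :] - means[None, :, :]                      # (T, K, D)
    mahal = np.sum(diff ** 2 / variances[None, :, :], axis=2)     # (T, K)
    log_norm = -0.5 * (dim * np.log(2.0 * np.pi)
                       + np.sum(np.log(variances), axis=1))       # (K,)
    return np.log(weights)[None, :] + log_norm[None, :] - 0.5 * mahal


def logsumexp(a, axis):
    """Numerically stable log(sum(exp(a))) along one axis."""
    m = np.max(a, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(a - m), axis=axis))


def utterance_log_likelihood(X, gmm):
    """Sum of per-frame GMM log-likelihoods (frames assumed independent)."""
    return float(np.sum(logsumexp(component_log_probs(X, *gmm), axis=1)))


def identify_language(X, models):
    """Maximum likelihood decision: pick the language whose GMM scores highest."""
    scores = {lang: utterance_log_likelihood(X, gmm) for lang, gmm in models.items()}
    return max(scores, key=scores.get), scores


def merge_scores(X, gmm):
    """K x K matrix of posterior inner products; a large off-diagonal entry
    means two components explain the same frames and are merge candidates."""
    log_p = component_log_probs(X, *gmm)
    resp = np.exp(log_p - logsumexp(log_p, axis=1)[:, None])      # (T, K)
    return resp.T @ resp
```

In this sketch, `identify_language(frames, models)` returns the best-scoring language label together with the per-language log-likelihoods, where `models` is a dictionary mapping each language label to its trained GMM tuple.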
