Identification of top-3 spoken Indian languages: An Ensemble learning-based approach

Speech recognition has advanced considerably for English, but far less for Indic languages. Recognizing Indic speech is challenging in itself, and the difficulty increases in multilingual scenarios. There is a pressing need for Indic speech recognition systems, yet a fully functional one has not been developed. One reason is the multilingual nature of India, in addition to the complexity of the Indic languages themselves. It is therefore important to identify the language-specific segments of multilingual speech before attempting recognition. In this paper, we present a system that segregates the three most widely spoken languages in India: English, Hindi and Bangla. We also experiment with segregating Bangla alone from the other two languages, motivated by the fact that Bangla is our mother tongue. Experiments were performed on more than 24 hours of data. The highest accuracies, 97.13% for segregating Bangla from the rest and 96.44% for trilingual segregation, were obtained with MFCC-based features coupled with ensemble learning-based classification.
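To make the idea of ensemble learning-based classification over per-clip feature vectors concrete, the following is a minimal sketch and not the system evaluated in this paper. It assumes a bagging-style ensemble of nearest-centroid classifiers combined by majority vote, which is only one of many possible ensemble schemes. Synthetic 13-dimensional Gaussian vectors stand in for MFCC-based features, and every function name (`train_ensemble`, `predict_ensemble`, `make_toy_data`, and so on) is hypothetical.

```python
import random
from collections import Counter

LANGUAGES = ("English", "Hindi", "Bangla")

def centroid(vectors):
    """Component-wise mean of equal-length feature vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def squared_distance(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def train_member(samples, rng):
    """Fit one nearest-centroid classifier on a bootstrap resample."""
    bootstrap = [rng.choice(samples) for _ in samples]
    by_label = {}
    for features, label in bootstrap:
        by_label.setdefault(label, []).append(features)
    return {label: centroid(vecs) for label, vecs in by_label.items()}

def predict_member(model, features):
    """Return the label whose centroid is closest to the input."""
    return min(model, key=lambda label: squared_distance(model[label], features))

def train_ensemble(samples, n_members=15, seed=0):
    """Train several members, each on its own bootstrap resample."""
    rng = random.Random(seed)
    return [train_member(samples, rng) for _ in range(n_members)]

def predict_ensemble(ensemble, features):
    """Combine member predictions by majority vote."""
    votes = Counter(predict_member(member, features) for member in ensemble)
    return votes.most_common(1)[0][0]

def make_toy_data(rng, per_class=40, dim=13):
    """Synthetic stand-in for per-clip MFCC vectors (assumption, not real speech)."""
    data = []
    for index, language in enumerate(LANGUAGES):
        for _ in range(per_class):
            features = [rng.gauss(index * 3.0, 1.0) for _ in range(dim)]
            data.append((features, language))
    return data

rng = random.Random(42)
train = make_toy_data(rng)
test = make_toy_data(rng, per_class=10)

ensemble = train_ensemble(train)
correct = sum(predict_ensemble(ensemble, x) == y for x, y in test)
accuracy = correct / len(test)
print(f"toy three-way accuracy: {accuracy:.2f}")
```

The same sketch covers the Bangla-versus-rest setting if the training labels are first mapped to two classes, for example "Bangla" and "Other". The toy accuracy it prints says nothing about real speech and is unrelated to the figures reported above.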
