Multilingual speaker recognition on Indian languages

In this paper, we explore the performance of multilingual speaker recognition systems developed on the IITKGP-MLILSC speech corpus. Closed-set speaker identification and speaker verification experiments are conducted separately on 13 widely spoken Indian languages. In particular, we focus on the effect of language mismatch on speaker recognition performance, both for individual languages and for all languages together. The standard GMM-based speaker recognition framework is used. While the average language-independent speaker identification rate is as high as 95.21%, the average equal error rate of 11.71% leaves scope for further improvement in speaker verification performance.
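The equal error rate quoted above is the operating point at which the false acceptance rate (impostor trials accepted) equals the false rejection rate (genuine trials rejected). The following is a minimal sketch of how such a figure could be computed from verification scores. It is not the authors' evaluation code: the function name `equal_error_rate`, the threshold sweep over observed scores, and the example scores are all assumptions made for illustration.

```python
def equal_error_rate(genuine_scores, impostor_scores):
    """Return (eer, threshold) at the point where FAR and FRR are closest.

    Assumption: a trial is accepted when its score is >= the threshold,
    and higher scores mean the claimed speaker is more likely.
    """
    # Candidate thresholds: every distinct score seen in either set.
    thresholds = sorted(set(genuine_scores) | set(impostor_scores))
    best = None
    for t in thresholds:
        # False acceptance rate: fraction of impostor trials accepted.
        far = sum(s >= t for s in impostor_scores) / len(impostor_scores)
        # False rejection rate: fraction of genuine trials rejected.
        frr = sum(s < t for s in genuine_scores) / len(genuine_scores)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            # With finite data FAR and FRR may not cross exactly,
            # so report their mean at the closest point.
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


# Hypothetical log-likelihood-ratio scores, not taken from the paper.
genuine = [2.0, 1.5, 0.9, 0.2]
impostor = [1.0, 0.1, -0.3, -1.2]
eer, threshold = equal_error_rate(genuine, impostor)
print(f"EER = {eer:.2%} at threshold {threshold}")
```

With only a handful of trials the two error curves move in coarse steps, which is why the sketch averages FAR and FRR at the closest point instead of interpolating between thresholds.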
