Development of Language Identification using Line Spectral Frequencies and Learning Vector Quantization Networks

Language identification has become a very active research area because of the growing need for intercultural communication. This paper proposes a language identification system using Line Spectral Frequencies (LSF) and a Learning Vector Quantization (LVQ) network. LSF was chosen for its robustness compared with ordinary linear predictor coefficients (LPC), and LVQ for its low complexity. Speech in three languages, i.e. Arabic, Malay, and Thai, was recorded from native male and female speakers at the IIUM Recording Studio. Several experiments were conducted to find the optimum parameters: sampling frequency (8000 Hz), LPC order (18), number of hidden neurons (300), and learning rate (0.01). Results show that the proposed system recognizes the trained languages with a recognition rate of 73.8%. Further research could improve performance by using different features, different classifiers, or deep neural networks.
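
The pipeline described above (LPC coefficients converted to LSF features, then classified by an LVQ network) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `lpc_to_lsf`, `train_lvq1`, and `predict_lvq` are hypothetical, the classifier is the basic LVQ1 update rule, and the prototype count and epoch count are illustrative assumptions. Only the learning rate default (0.01) is taken from the abstract.

```python
import numpy as np

def lpc_to_lsf(a):
    """Convert LPC coefficients [1, a1, ..., ap] to p line spectral
    frequencies in radians (assumes A(z) is minimum phase)."""
    a = np.asarray(a, dtype=float)
    # Symmetric and antisymmetric polynomials P(z) and Q(z).
    p_poly = np.concatenate([a, [0.0]]) + np.concatenate([[0.0], a[::-1]])
    q_poly = np.concatenate([a, [0.0]]) - np.concatenate([[0.0], a[::-1]])
    roots = np.concatenate([np.roots(p_poly), np.roots(q_poly)])
    angles = np.angle(roots)
    # Keep one root of each conjugate pair; drop the trivial roots at 0 and pi.
    eps = 1e-6
    return np.sort(angles[(angles > eps) & (angles < np.pi - eps)])

def train_lvq1(X, y, prototypes_per_class=2, lr=0.01, epochs=30, seed=0):
    """Basic LVQ1: pull the nearest prototype toward a sample of the same
    class, push it away otherwise. Hyperparameters are illustrative."""
    rng = np.random.default_rng(seed)
    protos, labels = [], []
    for c in np.unique(y):
        # Initialize each class's prototypes from random samples of that class.
        idx = rng.choice(np.flatnonzero(y == c), prototypes_per_class,
                         replace=False)
        protos.append(X[idx])
        labels.extend([c] * prototypes_per_class)
    W = np.vstack(protos).astype(float)
    labels = np.array(labels)
    for _ in range(epochs):
        for i in rng.permutation(len(X)):
            k = np.argmin(np.sum((W - X[i]) ** 2, axis=1))  # winning prototype
            sign = 1.0 if labels[k] == y[i] else -1.0
            W[k] += sign * lr * (X[i] - W[k])
    return W, labels

def predict_lvq(W, labels, X):
    """Assign each row of X the label of its nearest prototype."""
    d = ((X[:, None, :] - W[None, :, :]) ** 2).sum(axis=2)
    return labels[np.argmin(d, axis=1)]

if __name__ == "__main__":
    # Synthetic 18-dimensional stand-in features, NOT the recorded speech data.
    rng = np.random.default_rng(1)
    X = np.vstack([rng.normal(c, 1.0, size=(40, 18)) for c in (0.0, 5.0, 10.0)])
    y = np.repeat([0, 1, 2], 40)
    W, labels = train_lvq1(X, y)
    print("training accuracy:", (predict_lvq(W, labels, X) == y).mean())
    print("LSFs of a toy order-2 filter:", lpc_to_lsf([1.0, -0.9, 0.2]))
```

In a real system each feature vector would hold the 18 LSFs of one speech frame sampled at 8000 Hz, and the prototype count would play the role of the network size tuned in the experiments.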
