A Comparative Study on Feature Extraction Techniques for Language Identification

This paper presents a brief survey of feature extraction techniques used in language identification (LID) systems. The objective of an LID system is to automatically identify the language of a spoken utterance, and to do so quickly and accurately. To meet these criteria, extracting features from the acoustic signal is an important task, because LID depends mainly on language-specific characteristics. The efficiency of the feature extraction phase matters since it strongly affects the performance and quality of the whole system. Features commonly used in LID include cepstral coefficients, mel-frequency cepstral coefficients (MFCC), perceptual linear prediction (PLP) coefficients, and RASTA-PLP.
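As a rough illustration of the simplest feature named above, the sketch below computes real cepstral coefficients frame by frame as the inverse Fourier transform of the log magnitude spectrum. It is a minimal example rather than the procedure used in the paper: the helper names `frame_signal` and `real_cepstrum`, the 25 ms frame and 10 ms hop at an assumed 16 kHz sampling rate, the Hamming window, the choice of 13 coefficients, and the synthetic noise input are all assumptions made for illustration.

```python
import numpy as np

def frame_signal(signal, frame_len=400, hop=160):
    """Split a 1-D signal into overlapping frames (assumed 25 ms / 10 ms at 16 kHz)."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    return np.stack([signal[i * hop : i * hop + frame_len] for i in range(n_frames)])

def real_cepstrum(frame, n_coeffs=13, eps=1e-10):
    """Return the first n_coeffs real cepstral coefficients of one frame."""
    windowed = frame * np.hamming(len(frame))      # taper the frame edges
    magnitude = np.abs(np.fft.rfft(windowed))      # magnitude spectrum
    log_magnitude = np.log(magnitude + eps)        # eps avoids log(0)
    cepstrum = np.fft.irfft(log_magnitude, n=len(frame))
    return cepstrum[:n_coeffs]                     # keep the low-quefrency part

# Synthetic stand-in for one second of speech (assumption: no real audio here).
rng = np.random.default_rng(0)
signal = rng.standard_normal(16000)

frames = frame_signal(signal)
features = np.array([real_cepstrum(f) for f in frames])
```

Each row of `features` is the feature vector for one frame. MFCC, PLP, and RASTA-PLP follow the same frame-based pattern but insert perceptually motivated steps, such as mel-scale filterbanks or auditory-model weighting, before the final transform.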
