Native Language Identification using i-vector

Native Language Identification (NLI) is the task of determining a speaker's native language from speech produced only in a second language. Owing to its growing range of applications in speech signal processing, NLI has emerged as an important research area in recent years. In this paper we propose an i-vector based approach to building an automatic NLI system using MFCC and GFCC features. We evaluate our framework on the 2016 ComParE Native Language sub-challenge dataset, which contains English speech from speakers of 11 different native-language backgrounds. Our proposed method outperforms the baseline system, improving accuracy by 21.95% with the MFCC-based i-vector framework and by 22.81% with the GFCC-based i-vector framework.
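For readers unfamiliar with i-vectors, the framework rests on the standard total-variability model; the following is a brief sketch using the conventional notation (the symbols here are the usual ones from the i-vector literature, not notation taken from this paper):

```latex
M = m + Tw
```

where $M$ is the GMM mean supervector adapted to a given utterance, $m$ is the mean supervector of the universal background model (UBM), $T$ is the low-rank total-variability matrix estimated from training data, and $w$ is a latent factor with a standard normal prior. The i-vector is the posterior mean of $w$ given the utterance's sufficient statistics; it serves as a compact, fixed-length representation of the utterance over which a classifier can be trained.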
