Comparisons of extreme learning machine and backpropagation-based i-vector approach for speaker identification

The extreme learning machine (ELM) is a machine learning method used for regression and classification. In this paper, an extended comparison between the ELM and the backpropagation neural network (BPNN), each combined with the i-vector approach, is given for a closed-set speaker identification task using 120 speakers from the TIMIT database. Mel frequency cepstral coefficient (MFCC) and power normalized cepstral coefficient (PNCC) features form the feature extraction stage, while cepstral mean variance normalization (CMVN) and feature warping are applied to mitigate the linear channel effect. The 120 speakers comprise equal numbers of both genders and cover eight dialects from the TIMIT database. The results demonstrate that, for the different features, the i-vector combined with the ELM achieves a higher speaker identification accuracy (SIA) than the i-vector combined with the BPNN. The results also show that the i-vector with ELM approach is faster than the BPNN-based i-vector approach.
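To make the ELM side of the comparison concrete, the following is a minimal sketch of a generic single-hidden-layer ELM classifier, not the configuration used in the paper, which the abstract does not specify. The function names `train_elm` and `predict_elm`, the tanh activation, the hidden layer size, and the synthetic vectors standing in for i-vectors are all assumptions made for illustration. The point it shows is why an ELM can train faster than a BPNN: the hidden weights are drawn at random and left fixed, so only the output weights are fitted, in a single least-squares step rather than by iterative gradient descent.

```python
import numpy as np


def train_elm(X, y, n_hidden=50, seed=0):
    """Fit a basic ELM: random fixed hidden layer, least-squares output layer."""
    rng = np.random.default_rng(seed)
    n_classes = int(y.max()) + 1
    # Hidden weights and biases are random and never updated.
    W = rng.standard_normal((X.shape[1], n_hidden))
    b = rng.standard_normal(n_hidden)
    H = np.tanh(X @ W + b)            # hidden layer outputs
    T = np.eye(n_classes)[y]          # one-hot targets
    # Output weights via the Moore-Penrose pseudo-inverse (one closed-form step).
    beta = np.linalg.pinv(H) @ T
    return W, b, beta


def predict_elm(model, X):
    """Return the index of the highest-scoring class for each row of X."""
    W, b, beta = model
    return np.argmax(np.tanh(X @ W + b) @ beta, axis=1)


# Synthetic stand-in for i-vectors: 3 hypothetical speakers, 40 vectors each.
rng = np.random.default_rng(1)
centers = rng.standard_normal((3, 10)) * 5
y = np.repeat(np.arange(3), 40)
X = centers[y] + rng.standard_normal((120, 10))

model = train_elm(X, y, n_hidden=50)
pred = predict_elm(model, X)
accuracy = float(np.mean(pred == y))  # training-set identification accuracy
```

A BPNN with the same architecture would instead update both weight layers over many epochs, which is the source of the training-time difference the abstract reports.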
