Emirati-Accented Speaker Identification in Stressful Talking Conditions

This research aims to improve text-independent Emirati-accented speaker identification performance in stressful talking conditions using three distinct classifiers: First-Order Hidden Markov Models (HMM1s), Second-Order Hidden Markov Models (HMM2s), and Third-Order Hidden Markov Models (HMM3s). The database used in this work was collected from 25 native Emirati speakers per gender, each uttering eight common Emirati sentences in each of the neutral, shouted, slow, loud, soft, and fast talking conditions. Mel-Frequency Cepstral Coefficients (MFCCs) were extracted from the captured database as features. Based on HMM1s, HMM2s, and HMM3s, the average Emirati-accented speaker identification accuracy in stressful conditions is 58.6%, 61.1%, and 65.0%, respectively. The average speaker identification accuracy in stressful conditions achieved with HMM3s is very close to that attained in subjective assessment by human listeners.
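As a rough illustration of how HMM-based speaker identification generally works, the sketch below scores an utterance against one model per speaker and picks the speaker whose model gives the highest log-likelihood. It is a minimal sketch under stated assumptions, not the method of this work: it uses a discrete first-order HMM only (not HMM2s or HMM3s), the observation symbols stand in for quantized MFCC frames, and the two speaker models, their parameters, and the function names are hypothetical.

```python
import math


def log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm: log P(obs | model) for a discrete first-order HMM."""
    n_states = len(start)
    # Initialise with the start distribution and the first emission.
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    log_prob = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate through the transition matrix, then weight by the emission.
            alpha = [
                sum(alpha[r] * trans[r][s] for r in range(n_states)) * emit[s][obs[t]]
                for s in range(n_states)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # the model cannot produce this sequence
        log_prob += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid numerical underflow
    return log_prob


def identify_speaker(obs, models):
    """Return the name of the speaker model with the highest log-likelihood."""
    return max(models, key=lambda name: log_likelihood(obs, *models[name]))


# Hypothetical toy models: (start probabilities, transition matrix, emission matrix).
models = {
    "speaker_a": ([0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]], [[0.8, 0.2], [0.2, 0.8]]),
    "speaker_b": ([0.5, 0.5], [[0.5, 0.5], [0.5, 0.5]], [[0.5, 0.5], [0.5, 0.5]]),
}

# Symbols 0 and 1 are placeholders for quantized MFCC frames (an assumption).
print(identify_speaker([0, 0, 0, 0, 1, 1, 1, 1], models))
```

In a real system the discrete symbols would typically be replaced by continuous MFCC vectors with Gaussian-mixture emissions, and a higher-order model would condition each transition on more than one previous state.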
