Speaker identification in the shouted environment using Suprasegmental Hidden Markov Models

In this paper, Suprasegmental Hidden Markov Models (SPHMMs) have been used to enhance the recognition performance of text-dependent speaker identification in the shouted environment. Our speech database consists of two databases: our collected database and the Speech Under Simulated and Actual Stress (SUSAS) database. Our results show that SPHMMs significantly enhance speaker identification performance compared to Second-Order Circular Hidden Markov Models (CHMM2s) in the shouted environment. Using our collected database, speaker identification performance in this environment is 68% and 75% based on CHMM2s and SPHMMs, respectively. Using the SUSAS database, speaker identification performance in the same environment is 71% and 79% based on CHMM2s and SPHMMs, respectively.

[1]  Louis ten Bosch,et al.  Emotions, speech and the ASR framework , 2003, Speech Commun..

[2]  John H. L. Hansen,et al.  Analysis and compensation of speech under stress and noise for environmental robustness in speech recognition , 1996, Speech Commun..

[3]  John H. L. Hansen,et al.  Nonlinear feature based classification of speech under stress , 2001, IEEE Trans. Speech Audio Process..

[4]  Constantine Kotropoulos,et al.  Emotional speech recognition: Resources, features, and methods , 2006, Speech Commun..

[5]  Sadaoki Furui Speaker-dependent-feature extraction, recognition and processing techniques , 1991, Speech Commun..

[6]  Ismail Shahin,et al.  Enhancing speaker identification performance under the shouted talking condition using second-order circular hidden Markov models , 2006, Speech Commun..

[7]  Albino Nogueiras,et al.  Speech emotion recognition using hidden Markov models , 2001, INTERSPEECH.

[8]  John H. L. Hansen,et al.  A comparative study of traditional and newly proposed features for recognition of speech under stress , 2000, IEEE Trans. Speech Audio Process..

[9]  Y.-C. Zheng,et al.  Text-dependent speaker identification using circular hidden Markov models , 1988, ICASSP-88., International Conference on Acoustics, Speech, and Signal Processing.

[10]  David Escudero Mancebo,et al.  Analysis of prosodic features towards modelling of emotional and pragmatic attributes of speech , 2005, Proces. del Leng. Natural.

[11]  Alex Waibel,et al.  Detecting Emotions in Speech , 1998 .

[12]  I. Shahin Enhancing speaker identification performance using circular hidden Markov model , 2004, Proceedings. 2004 International Conference on Information and Communication Technologies: From Theory to Applications, 2004..

[13]  Ismail Shahin Improving Speaker Identification Performance Under the Shouted Talking Condition Using the Second-Order Hidden Markov Models , 2005, EURASIP J. Adv. Signal Process..

[14]  Biing-Hwang Juang,et al.  Hidden Markov Models for Speech Recognition , 1991 .

[15]  Yeunung Chen,et al.  Cepstral domain talker stress compensation for robust speech recognition , 1988, IEEE Trans. Acoust. Speech Signal Process..

[16]  Abdelaziz Kriouile,et al.  Automatic word recognition based on second-order hidden Markov models , 1994, IEEE Trans. Speech Audio Process..

[17]  Biing-Hwang Juang,et al.  Fundamentals of speech recognition , 1993, Prentice Hall signal processing series.

[18]  S. R. Mahadeva Prasanna,et al.  Combining evidence from source, suprasegmental and spectral features for a fixed-text speaker verification system , 2005, IEEE Transactions on Speech and Audio Processing.

[19]  K E Cummings,et al.  Analysis of the glottal excitation of emotionally styled and stressed speech. , 1995, The Journal of the Acoustical Society of America.