Enhancing speaker identification performance under the shouted talking condition using second-order circular hidden Markov models

It is known that the performance of speaker identification systems is high under the neutral talking condition; however, the performance deteriorates under the shouted talking condition. In this paper, second-order circular hidden Markov models (CHMM2s) have been proposed and implemented to enhance the performance of isolated-word text-dependent speaker identification systems under the shouted talking condition. Our results show that CHMM2s significantly improve speaker identification performance under such a condition compared to the first-order left-to-right hidden Markov models (LTRHMM1s), second-order left-to-right hidden Markov models (LTRHMM2s), and the first-order circular hidden Markov models (CHMM1s). Under the shouted talking condition, our results show that the average speaker identification performance is 23% based on LTRHMM1s, 59% based on LTRHMM2s, and 60% based on CHMM1s. On the other hand, the average speaker identification performance under the same talking condition based on CHMM2s is 72%.

[1]  K E Cummings,et al.  Analysis of the glottal excitation of emotionally styled and stressed speech. , 1995, The Journal of the Acoustical Society of America.

[2]  Biing-Hwang Juang,et al.  Hidden Markov Models for Speech Recognition , 1991 .

[3]  Biing-Hwang Juang,et al.  Fundamentals of speech recognition , 1993, Prentice Hall signal processing series.

[4]  Jianing Dai Isolated word recognition using Markov chain models , 1995, IEEE Trans. Speech Audio Process..

[5]  Biing-Hwang Juang,et al.  Mixture autoregressive hidden Markov models for speech signals , 1985, IEEE Trans. Acoust. Speech Signal Process..

[6]  John H. L. Hansen,et al.  A comparative study of traditional and newly proposed features for recognition of speech under stress , 2000, IEEE Trans. Speech Audio Process..

[7]  Lawrence R. Rabiner,et al.  A tutorial on hidden Markov models and selected applications in speech recognition , 1989, Proc. IEEE.

[8]  Ismail Shahin Improving Speaker Identification Performance Under the Shouted Talking Condition Using the Second-Order Hidden Markov Models , 2005, EURASIP J. Adv. Signal Process..

[9]  John H. L. Hansen,et al.  Analysis and compensation of speech under stress and noise for environmental robustness in speech recognition , 1996, Speech Commun..

[10]  D Cairns,et al.  NONLINEAR ANALYSIS AND DETECTION OF SPEECH UNDER STRESSED CONDITIONS , 1994 .

[11]  John H. L. Hansen,et al.  Nonlinear feature based classification of speech under stress , 2001, IEEE Trans. Speech Audio Process..

[12]  Yeunung Chen,et al.  Cepstral domain talker stress compensation for robust speech recognition , 1988, IEEE Trans. Acoust. Speech Signal Process..

[13]  John H. L. Hansen,et al.  Nonlinear analysis and classification of speech under stressed conditions , 1994 .

[14]  Abdelaziz Kriouile,et al.  Automatic word recognition based on second-order hidden Markov models , 1994, IEEE Trans. Speech Audio Process..

[15]  I. Shahin Enhancing speaker identification performance using circular hidden Markov model , 2004, Proceedings. 2004 International Conference on Information and Communication Technologies: From Theory to Applications, 2004..

[16]  Y.-C. Zheng,et al.  Text-dependent speaker identification using circular hidden Markov models , 1988, ICASSP-88., International Conference on Acoustics, Speech, and Signal Processing.

[17]  John H. L. Hansen,et al.  The Impact of Speech Under `Stress''on Military Speech Technology , 2000 .

[18]  L. R. Rabiner,et al.  An introduction to the application of the theory of probabilistic functions of a Markov process to automatic speech recognition , 1983, The Bell System Technical Journal.

[19]  Ismail Shahin,et al.  Text-dependent speaker identification using hidden Markov model with stress compensation technique , 1998, Proceedings IEEE Southeastcon '98 'Engineering for a New Era'.

[20]  Ismail Shahin,et al.  Speaker identification using dynamic time warping with stress compensation technique , 1998, Proceedings IEEE Southeastcon '98 'Engineering for a New Era'.

[21]  Jean-François Mari,et al.  A second-order HMM for high performance word and phoneme-based continuous speech recognition , 1996, 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings.