Stress Level Classification of Speech Using Euclidean Distance Metrics in a Novel Hybrid Multi-Dimensional Feature Space

Current automatic stress detection methods for speech make a binary decision: the speaker either is or is not under stress. Because the amount of stress a speaker experiences varies and can change gradually, a reliable stress level detection scheme is needed to assess the speaker's condition accurately. Such a capability is relevant to a number of applications, including the assessment of law enforcement personnel. Using speech and biometric data collected from a real-world, variable-stress-level law enforcement training scenario, this study illustrates two methods for automatically assessing stress levels in speech using a hybrid multi-dimensional feature space composed of frequency-based and Teager energy operator-based features. The first approach uses a nearest neighbor-type clustering scheme at the vowel token level to classify speech data into one of three levels of stress, yielding an overall error rate of 50.5%. The second approach employs accumulated Euclidean distance metric weighting at the sentence level, yielding a relative improvement of 12.1% in performance.
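The two classification ideas named above can be illustrated with a minimal Python sketch. It is not the study's implementation. It assumes that each stress level is represented by a single centroid in the feature space, that a vowel token is assigned to the nearest centroid by Euclidean distance, and that a sentence is labeled by summing the per-token distances to each centroid and taking the smallest total. The function names (`teager_energy`, `classify_token`, `classify_sentence`) and the toy centroids are hypothetical. The Teager energy operator is shown in its standard discrete form, x(n)^2 - x(n+1)x(n-1); how the study derives its features from the operator is not specified here.

```python
import numpy as np

def teager_energy(x):
    """Discrete Teager energy operator: psi[x(n)] = x(n)^2 - x(n+1) * x(n-1)."""
    x = np.asarray(x, dtype=float)
    return x[1:-1] ** 2 - x[:-2] * x[2:]

def classify_token(feature, centroids):
    """Assign one token to the nearest centroid (assumed: one centroid per stress level).

    Returns the index of the nearest centroid and the distances to all centroids.
    """
    distances = np.linalg.norm(np.asarray(centroids) - np.asarray(feature), axis=1)
    return int(np.argmin(distances)), distances

def classify_sentence(token_features, centroids):
    """Label a sentence by accumulating per-token Euclidean distances to each centroid."""
    totals = np.zeros(len(centroids))
    for feature in token_features:
        _, distances = classify_token(feature, centroids)
        totals += distances
    return int(np.argmin(totals))

# Toy example with three hypothetical stress-level centroids in a 2-D feature space.
centroids = np.array([[0.0, 0.0], [5.0, 5.0], [10.0, 10.0]])
tokens = np.array([[4.0, 4.0], [6.0, 6.0], [1.0, 1.0]])

token_labels = [classify_token(t, centroids)[0] for t in tokens]
sentence_label = classify_sentence(tokens, centroids)
```

In this toy example, one token is nearest to the first centroid when classified alone, while the accumulated distances over the whole sentence favor the second centroid. This shows how sentence-level accumulation can smooth over individual token-level decisions.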
