Comparing objective feature statistics of speech for classifying clinical depression

Human communication is saturated with emotional context that aids in interpreting a speaker's mental state. Research on classifying emotional states from speech has relied primarily on prosodic (e.g., pitch, energy, speaking rate) and/or spectral (e.g., formants) features. Glottal waveform features, while receiving less attention (due primarily to the difficulty of feature extraction), have also shown strong potential for clustering various emotional and stress states. This study compares these major categories of speech analysis in identifying and clustering feature statistics from a control group and a patient group clinically diagnosed with depression.
