An Investigation of Depressed Speech Detection: Features and Normalization

In recent years, the automatic detection of mental illness from the speech signal has attracted initial interest. Open questions include how speech segments should be selected, which features discriminate well, and what benefit feature normalization might bring given the speaker-specific nature of mental disorders. This paper addresses these questions empirically, using classifier configurations employed in emotion recognition from speech and evaluating them on a 47-speaker database of depressed and neutral read-sentence speech. Results demonstrate that (1) detailed spectral features are well suited to the task, (2) speaker normalization provides benefits mainly for less detailed features, and (3) dynamic information appears to provide little benefit. Classification accuracy using a combination of MFCC and formant-based features approached 80% for this database.
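
To make the idea of speaker normalization concrete, the following is a minimal sketch of one common scheme: per-speaker mean and variance normalization of frame-level features. The abstract does not state which normalization the authors used, so the z-score choice, the function name `normalize_per_speaker`, and the toy feature values are assumptions for illustration only.

```python
from collections import defaultdict
from statistics import mean, pstdev


def normalize_per_speaker(frames, speaker_ids, eps=1e-8):
    """Z-score every feature dimension using statistics of the same speaker.

    frames      -- list of feature vectors (lists of floats), one per frame
    speaker_ids -- list of speaker labels, aligned with `frames`
    eps         -- small constant guarding against division by zero
    (Hypothetical helper; the normalization scheme is an assumption.)
    """
    # Group frame indices by speaker.
    by_speaker = defaultdict(list)
    for idx, spk in enumerate(speaker_ids):
        by_speaker[spk].append(idx)

    normalized = [None] * len(frames)
    for indices in by_speaker.values():
        dims = len(frames[indices[0]])
        # Per-dimension mean and population standard deviation for this speaker.
        mus = [mean([frames[i][d] for i in indices]) for d in range(dims)]
        sigmas = [pstdev([frames[i][d] for i in indices]) for d in range(dims)]
        for i in indices:
            normalized[i] = [
                (frames[i][d] - mus[d]) / (sigmas[d] + eps) for d in range(dims)
            ]
    return normalized


# Toy example: two speakers whose features sit at very different offsets.
frames = [[1.0, 10.0], [3.0, 30.0], [100.0, 5.0], [104.0, 9.0]]
speakers = ["a", "a", "b", "b"]
normalized = normalize_per_speaker(frames, speakers)
```

After normalization, both speakers' features are centred on zero with unit spread, so a classifier sees deviations from each speaker's own baseline rather than absolute values.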
