Content-based clinical depression detection in adolescents

This paper studies the effectiveness of speech content for detecting clinical depression in adolescents. We also evaluated the performance of acoustic features such as Mel frequency cepstral coefficients (MFCC), short-time energy (Energy), zero crossing rate (ZCR) and the Teager energy operator (TEO), using Gaussian mixture models for depression detection. A clinical data set of speech from 139 adolescents, including 68 (49 girls and 19 boys) diagnosed as clinically depressed, was used in the classification experiments. Each subject participated in three 20-minute interactions. Classification was performed first on the whole data set and then on a smaller subset selected on the basis of behavioural constructs defined by trained human observers (data with constructs). In the experiments, the MFCC+Energy feature outperformed the TEO feature. The results indicated that using the construct-based speech content from the problem-solving interaction (PSI) session improved the detection accuracy. Accuracy improved by a further 4% when gender-dependent depression modelling was adopted. Using construct-based PSI session speech content, gender-based depression models achieved 65.1% average detection accuracy. For both types of features (TEO and MFCC), the correct classification rates were also higher for female speakers than for male speakers.
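As an illustration of three of the frame-level features named above, the following is a minimal sketch in plain Python. It is not the authors' implementation: the function names, frame size and hop length are hypothetical, and the abstract does not state the framing parameters. The TEO is written in its standard discrete form, Ψ[x(n)] = x(n)² − x(n−1)·x(n+1).

```python
import math

def frames(signal, size, hop):
    """Split a signal into overlapping frames (size and hop are assumed values)."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, hop)]

def short_time_energy(frame):
    """Sum of squared samples in one frame."""
    return sum(s * s for s in frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def teager_energy(frame):
    """Discrete Teager energy operator, defined for interior samples only."""
    return [frame[n] ** 2 - frame[n - 1] * frame[n + 1]
            for n in range(1, len(frame) - 1)]

# Hypothetical usage on a synthetic tone (not data from the study).
tone = [2.0 * math.cos(0.3 * n) for n in range(400)]
features = [
    (short_time_energy(f), zero_crossing_rate(f), sum(teager_energy(f)) / (len(f) - 2))
    for f in frames(tone, size=200, hop=100)
]
```

For a pure sinusoid A·cos(Ωn + φ), the TEO output is the constant A²·sin²(Ω), which makes a convenient sanity check. In a setup like the one described, such per-frame feature vectors would then be modelled with one Gaussian mixture model per class.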
