Time-frequency Analysis Based on Hilbert-Huang Transform for Depression Recognition in Speech

In recent years, automatic detection of depression from speech has attracted growing research interest. A key challenge is finding vocal patterns that discriminate between depressed patients and healthy people. To this end, we employed the Hilbert-Huang transform (HHT) for time-frequency analysis. Speech signals were decomposed into sub-band signals, which were then transformed into energy-frequency features for the analysis and detection of depression. In the experiment, speech from 124 participants (68 females and 56 males) was recorded in three tasks: interview, reading, and picture description. The results showed that the energy distribution of the intrinsic mode functions (IMFs) differed significantly between depressed patients and healthy people, and that this difference was found mainly in a relatively high-frequency range (1 kHz). This finding is consistent with the clinical observation of "energy loss" in depressed patients. Further, a speech-based depression classification model based on this finding was built and validated on the dataset. Classification accuracy was 75.5% for females and 71.2% for males, with specificities of 88.4% and 78.2%, respectively. These results suggest that HHT-based energy-frequency features are a promising indicator for automatic depression assessment.
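
To make the notion of an energy-frequency feature concrete, the following is a minimal sketch of the Hilbert step only, and not the pipeline used in this study. It assumes the sub-band components are already available: two synthetic sinusoids stand in for IMFs, and the empirical mode decomposition itself is omitted. The function names (`analytic_signal`, `energy_frequency_features`), the 8 kHz sampling rate, and the choice of mean instantaneous frequency plus energy share as the feature pair are all illustrative assumptions.

```python
import cmath
import math


def analytic_signal(x):
    """Analytic signal of a real sequence via a naive O(n^2) DFT.

    Slow but dependency-free; adequate for short demo signals only.
    """
    n = len(x)
    # Forward DFT.
    spectrum = [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    # Keep DC (and Nyquist when n is even), double the positive
    # frequencies, and zero the negative ones.
    for k in range(n):
        if k == 0 or (n % 2 == 0 and k == n // 2):
            gain = 1.0
        elif k < (n + 1) // 2:
            gain = 2.0
        else:
            gain = 0.0
        spectrum[k] *= gain
    # Inverse DFT gives x + j * H[x], where H is the Hilbert transform.
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def energy_frequency_features(components, fs):
    """Return (mean instantaneous frequency in Hz, energy share) per component.

    `components` is a list of equal-length real sequences, assumed here to
    be IMFs produced by some EMD implementation (not shown).
    """
    energies = [sum(v * v for v in c) for c in components]
    total = sum(energies)
    features = []
    for comp, energy in zip(components, energies):
        z = analytic_signal(comp)
        # Phase increment between consecutive samples, wrapped to (-pi, pi].
        steps = [cmath.phase(z[t + 1] * z[t].conjugate()) for t in range(len(z) - 1)]
        mean_freq = (sum(steps) / len(steps)) * fs / (2 * math.pi)
        features.append((mean_freq, energy / total))
    return features


# Synthetic stand-ins for two IMFs: a 1 kHz and a 250 Hz tone (assumed values).
FS = 8000   # sampling rate in Hz (assumption)
N = 160     # 20 ms frame
imfs = [
    [1.0 * math.sin(2 * math.pi * 1000 * t / FS) for t in range(N)],
    [2.0 * math.sin(2 * math.pi * 250 * t / FS) for t in range(N)],
]
features = energy_frequency_features(imfs, FS)
for freq, share in features:
    print(f"mean frequency {freq:7.1f} Hz, energy share {share:.2f}")
```

In this toy setting, a lower energy share in the component near 1 kHz would correspond to the kind of high-frequency "energy loss" described above. With real speech, the components would come from an EMD implementation, and the instantaneous frequency would typically vary over time rather than stay constant.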
