Spotting the Traces of Depression in Read Speech: An Approach Based on Computational Paralinguistics and Social Signal Processing

This work investigates whether a classification approach can identify effective depression markers in read speech, i.e., observable and measurable traces of the pathology in the way people read a predefined text. This matters because diagnosing depression remains a challenging problem, and reliable markers can contribute, at least partially, to addressing it. The experiments involved 110 individuals and revolve around the tendency of depressed people to read more slowly and to display silences that are both longer and more frequent. The results show that features designed to capture these differences reduce the error rate of a baseline classifier by more than 50% (from 31.8% to 15.5%). This is of particular interest because the new features amount to less than 10% of the original set (3 out of 32). Furthermore, the results appear to be consistent with neuroscience findings on brain-level differences between depressed and non-depressed individuals.
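The two percentages quoted above can be checked with a few lines of arithmetic. The sketch below assumes that "reduce the error rate by more than 50%" refers to the relative reduction with respect to the baseline; the function and variable names are illustrative and do not come from the paper.

```python
def relative_reduction(baseline: float, improved: float) -> float:
    """Fraction of the baseline error removed by the improved system."""
    return (baseline - improved) / baseline


# Error rates (in percent) as reported in the abstract.
baseline_error = 31.8
improved_error = 15.5

# Assumption: the ">50%" claim is a relative reduction, not an absolute one.
reduction = relative_reduction(baseline_error, improved_error)

# Size of the new feature group relative to the original set (3 out of 32).
feature_share = 3 / 32

print(f"relative error reduction: {reduction:.1%}")
print(f"share of new features:    {feature_share:.1%}")
```

Under this reading, the relative reduction is about 51.3% and the new features are about 9.4% of the original set, which agrees with both statements in the abstract.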
