Breaking Age Barriers With Automatic Voice-Based Depression Detection

Adults over the age of 60 are a growing population at risk for depression, and there is a need for automatic screening for this illness. Most existing voice-based depression datasets comprise speakers younger than 60, and variations in speech due to age and depression are not well understood. In this study, which uses Patient Health Questionnaires as the ground truth for depression severity, automatic depression detection is explored using acoustic-based prosodic, spectral, landmark, and voice quality features derived from smartphone recordings of 152 speakers in four age ranges (18–34, 35–48, 49–62, and 63–79). An age-dependent modeling paradigm for voice-based depression detection is proposed and evaluated. Results show that age-dependent models improve voice-based automatic depression classification accuracy, with absolute gains of up to 10% over an age-agnostic model. Further, when compared with age-agnostic and gender-dependent models, age-dependent models often produced greater depressed-class identification F-score sensitivity (absolute gains of up to 0.39). While automatically extracted acoustic voice features lead to statistically significant gains in depression detection accuracy over the age-agnostic modeling baseline (4%–9% absolute), manually extracted voice quality features are also useful (4%–7% absolute gains over baseline). This investigation demonstrates the benefits of implementing age modeling to improve voice-based depression screening via smart devices.
