Reduced vowel space is a robust indicator of psychological distress: A cross-corpus analysis

Reduced frequency range in vowel production is a well-documented speech characteristic of individuals with psychological and neurological disorders. Depression is known to influence motor control and, in particular, speech production. The assessment and documentation of reduced vowel space, and of the associated perceived hypoarticulation and reduced expressivity, often rely on subjective judgments. In this work, we investigate an automatic, unsupervised machine learning approach to assessing a speaker's vowel space in three distinct speech corpora, and we compare the observed vowel space measures of subjects with and without conditions associated with psychological distress, namely depression, post-traumatic stress disorder (PTSD), and suicidality. Our experiments are based on recordings of over 300 individuals. They show a significantly reduced vowel space in conversational speech for depression, PTSD, and suicidality, and we observe a similar trend of reduced vowel space for read speech. A possible explanation for a reduced vowel space is psychomotor retardation, a common symptom of depression that influences motor control and speech production.
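
To make the notion of a "reduced vowel space" concrete, the following minimal sketch computes the area spanned by a speaker's corner vowels in the first and second formant (F1, F2) plane and relates it to a reference area. It illustrates one common way to quantify vowel space and is not necessarily the measure used in this work; the function names and all formant values are hypothetical, chosen only for illustration.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]  # (F1 in Hz, F2 in Hz)


def polygon_area(points: Sequence[Point]) -> float:
    """Area of a simple polygon via the shoelace formula."""
    total = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]  # wrap around to close the polygon
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def vowel_space_ratio(speaker: Sequence[Point], reference: Sequence[Point]) -> float:
    """Speaker vowel-space area divided by a reference area (below 1 means smaller)."""
    return polygon_area(speaker) / polygon_area(reference)


# Hypothetical corner-vowel formants for /i/, /a/, /u/ (illustrative values only).
reference_vowels = [(300.0, 2300.0), (750.0, 1300.0), (350.0, 1000.0)]
speaker_vowels = [(350.0, 2100.0), (650.0, 1300.0), (400.0, 1100.0)]

ratio = vowel_space_ratio(speaker_vowels, reference_vowels)
print(f"vowel space ratio: {ratio:.2f}")
```

A ratio well below 1 indicates that the speaker's corner vowels lie closer together than the reference vowels, which is the pattern the abstract describes as a reduced vowel space.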
