An investigation of linguistic stress and articulatory vowel characteristics for automatic depression classification

Abstract The effects of psychomotor retardation associated with clinical depression have been linked to reduced variability in acoustic parameters. However, differences in linguistic stress between non-depressed and clinically depressed individuals have yet to be investigated. In this paper, an examination of vowel articulatory parameters reveals statistically significant differences in articulatory characteristics at a paraphonetic level. Among the articulatory characteristic features, tongue height and advancement, in terms of the ‘mid’ and ‘front’ vowel sets, show similar depression classification performance trends for both the DAIC-WOZ (English) and AViD (German) databases. Among the linguistic stress feature components, depressed speakers in both databases exhibit shorter vowel durations and less variance for the ‘low’, ‘back’, and ‘rounded’ vowel positions. On the DAIC-WOZ and AViD datasets, a small set of linguistic stress features derived from multiple vowel articulatory parameter sets yields statistically significant absolute gains of 7% and 20%, respectively, in two-class depression classification performance over baseline approaches. The linguistic stress feature results indicate that analysing specific vowel sets provides better discrimination between clinically depressed and non-depressed speakers. These findings can inform the design of more effective automatic depression classification systems.
