Candidacy of Physiological Measurements for Implicit Control of Emotional Speech Synthesis

There is a need for speech synthesis to be more emotionally expressive. Implicit control of a subset of affective vocal effects could be advantageous for some applications. Physiological measures associated with autonomic nervous system (ANS) activity are potential candidates for such input. This paper describes a pilot study investigating physiological sensor readings as potential input signals for modulating the speech synthesis of affective utterances composed by human users. A small corpus of audio, heart rate, and skin conductance data has been collected from eight doctoral student oral defenses. Planned analysis and research phases are outlined.
