A task-dynamic toolkit for modeling the effects of prosodic structure on articulation

The original task-dynamic model of speech production incorporated the theoretical tenets of Articulatory Phonology and provided a dynamics of inter-articulator coordination for single and co-produced constriction gestures, given a gestural score that specifies a time-dependent vector of gestural activations for a given utterance. More recently, the model has been significantly extended to provide a framework for investigating the higher-order dynamics of prosodic phrasing, syllable structure, lexical stress, and the prominence (accentual) properties associated with higher-level prosodic constituents (e.g., foot, word, phrase, sentence). The extended model has two new components. The first is an ensemble of gestural planning oscillators that defines a dynamics of gestural score formation: once the ensemble reaches an entrained steady state of relative phasing, the waveform of each oscillator specifies the activation function of that oscillator's associated constriction gesture and thereby triggers the onset of the gesture. The second component is a set of modulation gestures (μ-gestures) that, rather than activating constriction formation and release gestures in the vocal tract, modulate the temporal and spatial properties of all concurrently active constriction gestures. Modulation gestures are of two types: temporal modulation gestures (μT-gestures), which alter the rate of utterance timeflow by smoothly changing all frequency parameters of the planning oscillator ensemble; and spatial modulation gestures (μS-gestures), which spatially strengthen or reduce the motions of constriction gestures by smoothly changing the spatial target parameters of these constriction gestures.
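The effect of a temporal modulation gesture on utterance timeflow can be illustrated with a minimal sketch. It is not the toolkit's implementation: the raised-cosine activation shape, the multiplicative rate scaling `1 - strength * activation`, and all names and parameter values (`activation`, `cycle_duration`, `strength`, the 2 Hz base frequency) are assumptions made only for illustration. The sketch advances a single planning-oscillator phase and measures how long one full cycle takes when its frequency is smoothly lowered inside a window.

```python
import math


def activation(t, onset, offset):
    """Smooth (raised-cosine) activation: 0 outside [onset, offset], 1 at the midpoint."""
    if t <= onset or t >= offset:
        return 0.0
    return 0.5 * (1.0 - math.cos(2.0 * math.pi * (t - onset) / (offset - onset)))


def cycle_duration(base_freq_hz, strength, onset, offset, dt=1e-4):
    """Time (s) for the phase to advance one full cycle while a slowing gesture is active.

    strength is a hypothetical slowing factor in [0, 1): the instantaneous
    frequency is base_freq_hz * (1 - strength * activation(t)).
    """
    if not 0.0 <= strength < 1.0:
        raise ValueError("strength must lie in [0, 1)")
    phase, t = 0.0, 0.0
    while phase < 2.0 * math.pi:
        rate = 1.0 - strength * activation(t, onset, offset)
        phase += 2.0 * math.pi * base_freq_hz * rate * dt  # Euler step of the phase
        t += dt
    return t


if __name__ == "__main__":
    for s in (0.0, 0.25, 0.5):
        print(s, round(cycle_duration(2.0, s, onset=0.1, offset=0.4), 3))
```

In this sketch the added duration grows with `strength`, so a stronger slowing gesture stretches the same stretch of planned phase over more clock time.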
Key to the representation of prosodic phrasing has been the use of clock-slowing temporal modulation gestures (called prosodic gestures [π-gestures] in previous work) that are locally active in the region of phrasal boundaries and that slow the rate of utterance timeflow in direct proportion to the strength of the associated boundary. Central to the representation of syllable structure is the use of a coupling graph that defines the existence and strength of coupling in the network of gestural planning oscillators and shapes the manner in which gestures are coordinated. Concepts from graph theory have been crucial to understanding how hypothesized differences among coupling graphs correctly predict empirically demonstrated intra-syllabic differences between onsets and codas in both the mean values and variabilities of C-C, C-V, and V-C timing patterns. In this paper, we describe a set of recent developments to our task-dynamic ‘toolkit’ (planning oscillator ensemble and temporal modulation gestures) and how they have been used to interpret and simulate experimental data on the interactions of stress and prominence in shaping the “prosodically driven phonetic detail” [14] of speech.
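How a coupling graph shapes steady-state relative phasing can be sketched with a reduced, phase-only oscillator network. This is an illustrative simplification under stated assumptions, not the model's actual planning-oscillator equations: each edge `(i, j, strength, target)` is assumed to pull the relative phase of oscillators `i` and `j` toward `target` through a sine coupling term, and the example graph (two consonant oscillators each coupled in-phase to a vowel and anti-phase to each other, with equal strengths) is a hypothetical onset-like configuration chosen for the demonstration.

```python
import math


def wrap(angle):
    """Wrap an angle in radians to the interval (-pi, pi]."""
    return math.atan2(math.sin(angle), math.cos(angle))


def settle(edges, phases, omega=2.0 * math.pi, dt=0.005, steps=4000):
    """Euler-integrate a phase-coupled ensemble and return the final phases.

    edges:  list of (i, j, strength, target), where target is the intended
            relative phase phi_i - phi_j in radians (an assumed convention).
    phases: dict mapping oscillator name to initial phase in radians.
    """
    phases = dict(phases)
    for _ in range(steps):
        rate = {name: omega for name in phases}  # shared natural frequency
        for i, j, strength, target in edges:
            pull = strength * math.sin(phases[i] - phases[j] - target)
            rate[i] -= pull  # push the i-j relative phase toward its target
            rate[j] += pull
        for name in phases:
            phases[name] += rate[name] * dt
    return phases


# Hypothetical onset-like graph with competing couplings of equal strength.
ONSET_EDGES = [
    ("C1", "V", 1.0, 0.0),       # C1 in-phase with V
    ("C2", "V", 1.0, 0.0),       # C2 in-phase with V
    ("C1", "C2", 1.0, math.pi),  # C1 and C2 anti-phase with each other
]

if __name__ == "__main__":
    final = settle(ONSET_EDGES, {"C1": 0.3, "C2": -0.2, "V": 0.0})
    for c in ("C1", "C2"):
        print(c, "relative to V (deg):", round(math.degrees(wrap(final[c] - final["V"])), 1))
```

Because the three targets cannot all be met at once, the ensemble settles on a compromise in which the two consonant phases sit symmetrically on either side of the vowel phase. Changing the edge list or strengths changes that compromise, which is the sense in which the graph shapes coordination in this toy setting.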

[1] G. Fant et al. Auditory analysis and perception of speech, 1975.

[2] Heejin Kim et al. The stress foot as a unit of planned timing: evidence from shortening in the prosodic phrase, 2005, INTERSPEECH.

[3] J. Devin McAuley et al. Effect of deviations from temporal expectations on tempo discrimination of isochronous tone sequences, 1998, Journal of experimental psychology. Human perception and performance.

[4] Plínio Almeida Barbosa et al. Explaining Cross-Linguistic Rhythmic Variability via a Coupled-Oscillator Model of Rhythm Production, 2002.

[5] Tommi Nieminen et al. Coupled Oscillator Model of Speech Rhythm, 1999.

[6] Dani Byrd et al. The elastic phrase: modeling the dynamics of boundary-adjacent lengthening, 2003, J. Phonetics.

[7] Gérard Bailly et al. Motor Control for Speech Skills: a Connectionist Approach, 1991.

[8] Plínio Almeida Barbosa et al. From syntax to acoustic duration: A dynamical model of speech rhythm production, 2007, Speech Commun.

[9] Robert F. Port et al. Rhythmic constraints on stress timing in English, 1998.

[10] D. Klatt. Letter: Interaction between two factors that influence vowel duration, 1973, The Journal of the Acoustical Society of America.

[11] C. Browman et al. Competing constraints on intergestural coordination and self-organization of phonological structures, 2000.

[12] I. Lehiste. The Timing of Utterances and Linguistic Boundaries, 1972.

[13] Gilbert Strang et al. Introduction to applied mathematics, 1988.

[14] Dani Byrd et al. The Elastic Phrase: Dynamics of Boundary-Adjacent Lengthening, 2003.

[15] B.E.F. Lindblom et al. Some Temporal Regularities of Spoken Swedish, 1975.

[16] C. Browman et al. Some Notes on Syllable Structure in Articulatory Phonology, 1988, Phonetica.

[17] Elliot L. Saltzman et al. A Dynamical Approach to Gestural Patterning in Speech Production, 1989.

[18] A. Eriksson et al. Aspects of Swedish speech rhythm, 1991.

[19] Eric Vatikiotis-Bateson et al. The articulatory dynamics of running speech: gestures from phonemes?, 1992, ICSLP.

[20] A.W.F. Huggins et al. On Isochrony and Syntax, 1975.

[21] Gérard Bailly et al. Characterisation of rhythmic patterns for text-to-speech synthesis, 1994, Speech Communication.

[22] Dani Byrd et al. The Distinctions Between State, Parameter and Graph Dynamics in Sensorimotor Control and Coordination, 2006.

[23] Taehong Cho et al. Prosodically driven phonetic detail in speech processing: The case of domain-initial strengthening in English, 2007, J. Phonetics.

[24] C. Browman et al. Articulatory Phonology: An Overview, 1992, Phonetica.

[25] Matthew P. Aylett et al. Proceedings of the XIVth International Congress of Phonetic Sciences, 1999.

[26] Dani Byrd et al. Action to Language via the Mirror Neuron System: The role of vocal tract gestural action units in understanding the evolution of phonology, 2006.

[27] Daniel Hirst et al. Modelling French Micromelody: Analysis and Synthesis, 1986.

[28] Dani Byrd et al. A Phase Window Framework for Articulatory Timing, 1996, Phonology.

[29] M. Turvey et al. Diffusive, Synaptic, and Synergetic Coupling: An Evaluation Through In-Phase and Antiphase Rhythmic Movements, 1996, Journal of motor behavior.

[30] H. Haken et al. A theoretical model of phase transitions in human hand movements, 2004, Biological Cybernetics.

[31] R. G. Coyle et al. Introduction to system dynamics, 1996.

[32] John Kingston et al. Papers in Laboratory Phonology: Index of names, 1990.

[33] Gérard Bailly et al. Formant trajectories as audible gestures: An alternative for speech synthesis, 1991.

[34] M. Mon-Williams et al. Motor Control and Learning, 2006.

[35] C. A. Fowler et al. Production and perception of coarticulation among stressed and unstressed vowels, 1981, Journal of speech and hearing research.

[36] M. Arbib. Action to language via the mirror neuron system, 2006.

[37] D. Byrd. C-Centers Revisited, 1995.

[38] Laura C. Dilley et al. Perceptual organization in intonational phonology: A test of parallelism, 2006.

[39] Stefanie Shattuck-Hufnagel et al. Word-boundary-related duration patterns in English, 2000, J. Phonetics.

[40] Robert F. Port et al. The English voicing contrast as velocity perturbation, 1992, ICSLP.

[41] Jan Edwards et al. Papers in Laboratory Phonology: Lengthenings and shortenings and the nature of prosodic constituency, 1990.

[42] Anders Löfqvist et al. Dynamics of intergestural timing: a perturbation study of lip-larynx coordination, 1998, Experimental Brain Research.

[43] John F. Kolen et al. Resonance and the Perception of Musical Meter, 1994, Connect. Sci.

[44] R. Kager. A Metrical Theory of Stress and Destressing in English and Dutch, 1989.

[45] B. Hayes. A metrical theory of stress rules, 1980.

[46] E. Large et al. The dynamics of attending: How people track time-varying events, 1999.

[47] C. Stoel-Gammon et al. Phonetic inventories, 15-24 months: a longitudinal study, 1985, Journal of speech and hearing research.

[48] C. Browman et al. Papers in Laboratory Phonology: Tiers in articulatory phonology, with some implications for casual speech, 1990.

[49] S. Rossignol et al. Neural Control of Rhythmic Movements in Vertebrates, 1988.

[50] Dani Byrd et al. Task-dynamics of gestural timing: Phase windows and multifrequency rhythms, 2000.

[51] Frank Harary et al. Graph Theory, 2016.

[52] M. Ouellet et al. L'intonation, le système du français : description et modélisation, 2000.