A NEURAL MODEL OF SPEECH PRODUCTION AND SUPPORTING EXPERIMENTS

This paper describes the DIVA model of speech production and presents results of experiments designed to test and refine the model. According to the model, production of a phoneme or syllable starts with activation of a speech sound map cell (in left ventral premotor cortex) corresponding to the sound to be produced. This leads to production of the sound through two motor subsystems: a feedback control subsystem and a feedforward control subsystem. In the feedback control subsystem, signals from the premotor cortex travel to the auditory and somatosensory cortical areas through tuned synapses that encode sensory expectations for the sound being produced. These expectations take the form of time-varying auditory and somatosensory target regions. The target regions are compared to the current auditory and somatosensory state, and any discrepancy between the target and the current state leads to a corrective command signal to motor cortex. In the feedforward control subsystem, signals project from premotor cortex to primary motor cortex, both directly and via the cerebellum. These signals are tuned with practice by monitoring the commands from previous attempts to produce the sound, initially under feedback control. Feedforward and feedback-based control signals are combined in the model’s motor cortex to form the overall motor command. We present experimental results that support two theoretical characteristics of the model: its use of auditory target regions (including hypothesized effects of perceptual acuity on production target size), and its ability to achieve stable acoustic results using motor equivalent tradeoffs between articulatory gestures.
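The interaction of the two subsystems can be illustrated with a toy one-dimensional sketch. This is not the DIVA implementation: the scalar articulator, the identity mapping from articulator position to auditory state, the gains, and every function name below are assumptions made purely for illustration. The sketch shows an auditory target region (an interval rather than a point), a feedback command driven by the discrepancy between the region and the current state, and a feedforward command that is tuned across attempts from the feedback corrections of the previous attempt.

```python
def region_error(value, lo, hi):
    """Signed distance from value to the target region [lo, hi]; zero inside it."""
    if value < lo:
        return lo - value
    if value > hi:
        return hi - value
    return 0.0


def produce(ff_command, target, fb_gain=0.5, steps=20):
    """Simulate one production attempt (hypothetical toy plant).

    The feedforward command sets the initial articulator position; feedback
    control then corrects any remaining discrepancy with the target region.
    Returns (final auditory state, summed feedback correction).
    """
    lo, hi = target
    position = ff_command          # toy assumption: auditory state == position
    fb_total = 0.0
    for _ in range(steps):
        fb = fb_gain * region_error(position, lo, hi)  # corrective command
        position += fb             # overall command = feedforward + feedback
        fb_total += fb
    return position, fb_total


def simulate(trials=10, target=(0.8, 1.0), learning_rate=0.5):
    """Tune the feedforward command over repeated attempts.

    After each attempt, part of the feedback correction is folded into the
    feedforward command, so reliance on feedback control shrinks with practice.
    Returns a list of (final auditory state, summed feedback correction).
    """
    ff_command = 0.0               # untrained: first attempt is feedback-driven
    history = []
    for _ in range(trials):
        final_state, fb_total = produce(ff_command, target)
        history.append((final_state, fb_total))
        ff_command += learning_rate * fb_total
    return history


if __name__ == "__main__":
    for trial, (state, fb) in enumerate(simulate(), start=1):
        print(f"trial {trial}: auditory state {state:.4f}, feedback used {fb:.4f}")
```

In this sketch the feedback contribution shrinks with each attempt as the feedforward command takes over, and any state inside the target region produces no corrective command at all, which is the sense in which a region, rather than a point target, tolerates variability.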
