An auditory-feedback-based neural network model of speech production that is robust to developmental changes in the size and shape of the articulatory system.

This article demonstrates that self-produced auditory feedback is sufficient to train a mapping between auditory target space and articulator space while the structures of speech production undergo considerable developmental restructuring. A challenge for competing theories that propose invariant constriction targets is that it is unclear what teaching signal could specify constriction location and degree so that a mapping between constriction target space and articulator space can be learned. A model trained by auditory feedback is predicted to accomplish speech goals in auditory target space by continuously learning to use different articulator configurations as the acoustic properties of the vocal tract change during development. The Maeda articulatory synthesis component of the DIVA neural network model (Guenther et al., 1998) was modified to reflect the development of the vocal tract, using measurements taken from MR images of children. After training, the model maintained the 11 English vowel targets in auditory planning space, using varying articulator configurations, despite the morphological changes that occur during development. The vocal-tract constriction pattern (derived from the vocal-tract area function) and the formant values varied over the course of development in correspondence with morphological changes in the structures involved in speech production. Despite the developmental changes in the acoustical properties of the vocal tract, the model also demonstrated motor-equivalent speech production under lip-restriction conditions. It did so in a self-organizing manner, even though it had no prior experience with lip restriction during training.
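The central idea, reaching a fixed auditory target with different articulator configurations as the vocal tract changes, can be illustrated with a deliberately small toy. The sketch below is not the DIVA model or the Maeda synthesizer: the linear `synthesize` function, its coefficients, the `scale` parameter standing in for vocal-tract growth, and the helper names `babble_jacobian` and `reach` are all hypothetical choices made for illustration. The only teaching signal the controller uses is the "heard" output of its own small exploratory movements.

```python
def synthesize(a, scale):
    """Hypothetical toy vocal tract: 3 articulators -> (F1, F2) in Hz.

    `scale` is an assumed stand-in for vocal-tract size; smaller values
    (a shorter tract) raise both formants. Coefficients are invented.
    """
    jaw, tongue, lips = a
    f1 = (500.0 + 300.0 * jaw - 200.0 * tongue + 100.0 * lips) / scale
    f2 = (1500.0 + 150.0 * jaw + 400.0 * tongue - 300.0 * lips) / scale
    return (f1, f2)


def babble_jacobian(a, scale, free, eps=1e-3):
    """Estimate how each free articulator changes the heard formants.

    Uses only self-produced output: nudge one articulator, listen, compare.
    Returns one (dF1, dF2) column per free articulator.
    """
    base = synthesize(a, scale)
    cols = []
    for i in free:
        probe = list(a)
        probe[i] += eps
        out = synthesize(probe, scale)
        cols.append(((out[0] - base[0]) / eps, (out[1] - base[1]) / eps))
    return cols


def reach(target, scale, free=(0, 1, 2), steps=40, gain=0.5):
    """Move the free articulators until the heard output matches `target`.

    Each step applies a pseudoinverse update J^T (J J^T)^-1 e, where e is
    the auditory error. Articulators not listed in `free` stay at 0.0.
    """
    a = [0.0, 0.0, 0.0]
    for _ in range(steps):
        out = synthesize(a, scale)
        e = (target[0] - out[0], target[1] - out[1])
        cols = babble_jacobian(a, scale, free)
        # Entries of the symmetric 2x2 matrix J J^T.
        m00 = sum(c[0] * c[0] for c in cols)
        m01 = sum(c[0] * c[1] for c in cols)
        m11 = sum(c[1] * c[1] for c in cols)
        det = m00 * m11 - m01 * m01
        # Solve (J J^T) w = e by explicit 2x2 inversion.
        w0 = (m11 * e[0] - m01 * e[1]) / det
        w1 = (-m01 * e[0] + m00 * e[1]) / det
        for i, c in zip(free, cols):
            a[i] += gain * (c[0] * w0 + c[1] * w1)
    return a


if __name__ == "__main__":
    target = (600.0, 1700.0)  # assumed auditory target in Hz
    for scale in (0.7, 0.85, 1.0):  # assumed "child" to "adult" sizes
        config = reach(target, scale)
        print(scale, [round(x, 3) for x in config], synthesize(config, scale))
    # Lip restriction: lips (index 2) held fixed, never practiced beforehand.
    restricted = reach(target, 1.0, free=(0, 1))
    print("lips fixed:", [round(x, 3) for x in restricted],
          synthesize(restricted, 1.0))
```

In this toy, the same auditory target is reached at every `scale` with a different articulator configuration, and it is still reached when the lips are held fixed, because three articulators control only two auditory dimensions. That redundancy is what makes motor-equivalent solutions possible here; the actual model learns its mapping with a neural network and a far more realistic synthesizer.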

[1] J. Hillenbrand et al. Acoustic characteristics of American English vowels. The Journal of the Acoustical Society of America, 1994.

[2] W. F. Katz et al. Acoustic analysis of compensatory articulation in children. The Journal of the Acoustical Society of America, 1988.

[3] Raymond D. Kent. The Speech Sciences. 1997.

[4] Louis Goldstein et al. Gestural specification using dynamically-defined articulatory structures. 1990.

[5] A. Meltzoff et al. Infant vocalizations in response to speech: vocal imitation and developmental change. The Journal of the Acoustical Society of America, 1996.

[6] N. A. Bernshteĭn et al. Human motor actions: Bernstein reassessed. 1984.

[7] G. E. Peterson et al. Control Methods Used in a Study of the Vowels. 1951.

[8] James Lubker et al. Formant frequencies of some fixed-mandible vowels and a model of speech motor programming by predictive simulation. 1979.

[9] A. Liberman et al. The motor theory of speech perception revised. Cognition, 1985.

[10] Frank H. Guenther et al. An auditory-feedback-based model of speech production in the developing child. 1998.

[11] Louis Goldstein et al. Representation and reality: physical systems and phonological structure. 1990.

[12] J. Werker et al. Infants listen for more phonetic detail in speech perception than in word-learning tasks. Nature, 1997.

[13] P. Huttenlocher. Morphometric study of human cerebral cortex development. Neuropsychologia, 1990.

[14] J. Werker et al. Developmental changes in perception of nonnative vowel contrasts. Journal of Experimental Psychology: Human Perception and Performance, 1994.

[15] J. W. Folkins et al. Variability of lip and jaw movements in children and adults: implications for the development of speech motor control. Journal of Speech and Hearing Research, 1985.

[16] C. Fowler. An event approach to the study of speech perception from a direct realist perspective. 1986.

[17] James D. Miller. Auditory-perceptual interpretation of the vowel. 1987.

[18] S. Wood et al. The acoustical significance of tongue, lip, and larynx maneuvers in rounded palatal vowels. The Journal of the Acoustical Society of America, 1986.

[19] S. Wood. A radiographic analysis of constriction locations for vowels. 1979.

[20] Frank H. Guenther et al. Speech motor control: Acoustic goals, saturation effects, auditory feedback and internal models. Speech Commun., 1997.

[21] P. Kuhl. Perception of auditory equivalence classes for speech in early infancy. 1983.

[22] P. Jusczyk. From general to language-specific capacities: the WRAPSA Model of how speech perception develops. 1993.

[23] Peter F. MacNeilage et al. Acquisition of Speech Production: Frames, Then Content. Attention and Performance XIII, 2018.

[24] Shinji Maeda et al. Compensatory Articulation During Speech: Evidence from the Analysis and Synthesis of Vocal-Tract Shapes Using an Articulatory Model. 1990.

[25] J. H. Abbs et al. Additional observations on responses to resistive loading of the jaw. Journal of Speech and Hearing Research, 1976.

[26] Elliot L. Saltzman et al. A Dynamical Approach to Gestural Patterning in Speech Production. 1989.

[27] G. Edelman. Neural Darwinism: The Theory of Neuronal Group Selection. 1989.

[28] Pascal Perrier et al. Compensation strategies for the perturbation of the rounded vowel [u] using a lip-tube: A study of the control space in speech production. 1995.

[29] P. Kuhl. Speech perception in early infancy: perceptual constancy for spectrally dissimilar vowel categories. The Journal of the Acoustical Society of America, 1979.

[30] J. H. Abbs et al. Lip and jaw motor control during speech: responses to resistive loading of the jaw. Journal of Speech and Hearing Research, 1975.

[31] Raymond D. Kent. Psychobiology of speech development: coemergence of language and a movement system. The American Journal of Physiology, 1984.

[32] F. H. Guenther et al. Speech sound acquisition, coarticulation, and rate effects in a neural network model of speech production. Psychological Review, 1995.

[33] J. Kelso et al. Functionally specific articulatory cooperation following jaw perturbations during speech: evidence for coordinative structures. Journal of Experimental Psychology: Human Perception and Performance, 1984.

[34] Ursula Gisela Goldstein et al. An articulatory model for the vocal tracts of growing children. 1980.

[35] F. Guenther et al. A theoretical investigation of reference frames for the planning of speech movements. 1998.

[36] E. Thelen. Motor development: A new synthesis. 1995.

[37] N. A. Bernshteĭn. The co-ordination and regulation of movements. 1967.

[38] G. Edelman et al. Solving Bernstein's problem: a proposal for the development of coordinated movement by selection. Child Development, 1993.

[39] Raymond D. Kent. Sensorimotor Aspects of Speech Development. 1981.

[40] Jordan R. Green. Physiologic development of speech motor control: articulatory coordination of lips and jaw. 1998.

[41] Gérard Bailly et al. Learning to speak. Sensori-motor control of speech movements. Speech Commun., 1997.

[42] Linda Polka et al. Developmental changes in speech perception: new challenges and new directions. 1993.