Speech Perception and Production by a Self-Organizing Neural Network.

Abstract: Considerations of the real-time self-organization of neural networks for speech recognition and production have led to a new understanding of several key issues in such networks, most notably a definition of new processing units and of the functions of hierarchical levels in the auditory system. An important function of a particular neural level in the auditory system is to provide a partially compressed code, mapped to the articulatory system, that permits imitation of novel sounds. Furthermore, top-down priming signals from the articulatory system to the auditory system help to stabilize the emerging auditory code. These structures help explain results from the motor theory, which states that speech is analyzed in terms of how it would be produced. Higher stages of processing require chunking, or unitization, of the emerging language code, an example of a classical grouping problem. The partially compressed auditory codes are further compressed into item codes (e.g., phonemic segments), which are stored in a working memory representation whose short-term memory activity pattern is its code. A masking field level receives input from this working memory and encodes this input into list chunks, whose top-down signals organize the items in working memory into coherent groupings with invariant properties. This total architecture sheds new light on key speech issues such as coarticulation, analysis-by-synthesis, motor theory, categorical perception, invariant speech perception, word superiority, and phonemic restoration.
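The working-memory and masking-field interaction described above can be illustrated with a minimal sketch. This is not the paper's network equations; it is a toy model, under assumed parameters, in which a sequence of items is stored as a short-term memory (STM) activity gradient and candidate list chunks compete, with longer fully supported chunks able to mask their sub-chunks (a simplified stand-in for the masking field's self-similar competition).

```python
# Toy sketch (illustrative only, not the published model): a working
# memory stores a sequence of items as an STM activity gradient, and a
# "masking field" scores stored list chunks against that pattern,
# giving longer matching chunks a competitive advantage.

def store_items(items, decay=0.7):
    """Store a sequence in working memory as an STM activity pattern.
    Each new item is stored at full activity while earlier items decay,
    yielding a recency gradient over the stored items."""
    stm = {}
    for item in items:
        for stored in stm:
            stm[stored] *= decay      # older items lose activity
        stm[item] = 1.0               # newest item is most active
    return stm

def masking_field(stm, chunks):
    """Score each candidate list chunk by the average STM activity of
    its items, weighted by chunk length so that a longer chunk that is
    fully supported can mask the shorter chunks it contains."""
    scores = {}
    for chunk in chunks:
        if all(item in stm for item in chunk):
            support = sum(stm[item] for item in chunk) / len(chunk)
            scores[chunk] = support * len(chunk)   # length advantage
        else:
            scores[chunk] = 0.0                    # chunk not fully present
    winner = max(scores, key=scores.get)
    return winner, scores

# Example: the item sequence M-Y-S-E-L-F supports the chunks MY, SELF,
# and MYSELF; the full-length chunk wins and masks its parts.
stm = store_items(["M", "Y", "S", "E", "L", "F"])
winner, scores = masking_field(
    stm,
    chunks=[("M", "Y"), ("S", "E", "L", "F"), ("M", "Y", "S", "E", "L", "F")],
)
```

The length-weighted score is a crude proxy for the self-similarity property of masking fields, in which larger chunks have broader receptive fields and stronger competitive weight; the decay parameter and scoring rule here are assumptions chosen only to make the grouping behavior visible.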