Connectionist Modeling of Language: Examples and Implications

Researchers interested in human cognitive processes have long used computer simulations to try to identify the principles of cognition. The strategy has been to build computational models that embody putative principles and then to examine how well such models capture human performance in cognitive tasks. Until the early 1980s, this effort was undertaken largely within the context of the “computer metaphor” of mind: researchers built computational models on the assumption that the human mind operates as though it were a conventional digital computer. However, with the advent of so-called connectionist, neural network, or parallel distributed processing models (Anderson, Silverstein, Ritz, & Jones, 1977; Hinton & Anderson, 1981; McClelland & Rumelhart, 1981; McClelland, Rumelhart, & PDP Research Group, 1986; Rumelhart, McClelland, & PDP Research Group, 1986), researchers began exploring the implications of principles that are more broadly consistent with the style of computation employed by the brain.

In connectionist models, cognitive processes take the form of cooperative and competitive interactions among large numbers of simple, neuron-like processing units (see Figure 1). Unit interactions are governed by weighted connections that collectively encode the long-term knowledge of the system. The activity of some of the units encodes the input to the system; the resulting activity of other units encodes the system’s response to that input. The patterns of activity of the remaining units, the so-called hidden units, constitute learned internal representations that mediate between inputs and outputs. Learning involves modifying the values of connection weights based on feedback from the environment on the accuracy of the system’s responses. Although each unit exhibits non-linear spatial and temporal summation, units and connections are not generally considered to be in one-to-one correspondence with actual neurons and synapses. Rather, connectionist systems attempt to capture the essential computational properties of the vast ensembles of real neuronal elements found in the brain, through simulations of smaller networks of units. In this way, the approach is distinct from computational neuroscience (Sejnowski, Koch, & Church
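
The ingredients described above (input units, hidden units, an output unit, weighted connections, and learning from feedback on the accuracy of responses) can be illustrated with a very small simulation. The sketch below is only one possible instantiation, not the specific model of any paper cited here. It assumes sigmoid units, a squared-error measure, and the back-propagation weight update; the class name `TinyNetwork`, the exclusive-or training patterns, the learning rate, and the number of hidden units are all hypothetical choices made for illustration.

```python
import math
import random


def sigmoid(x):
    """Smooth, non-linear squashing function applied to a unit's summed input."""
    return 1.0 / (1.0 + math.exp(-x))


class TinyNetwork:
    """Input units -> hidden units -> one output unit (illustrative only)."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        # One weight per input unit plus a bias (last entry) for each hidden unit.
        self.w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                         for _ in range(n_hidden)]
        # One weight per hidden unit plus a bias (last entry) for the output unit.
        self.w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]

    def forward(self, inputs):
        """Return (hidden activities, output activity) for one input pattern."""
        hidden = [sigmoid(sum(w * x for w, x in zip(ws[:-1], inputs)) + ws[-1])
                  for ws in self.w_hidden]
        output = sigmoid(sum(w * h for w, h in zip(self.w_out[:-1], hidden))
                         + self.w_out[-1])
        return hidden, output

    def error(self, inputs, target):
        """Squared error of the network's response to one pattern."""
        _, output = self.forward(inputs)
        return 0.5 * (target - output) ** 2

    def train_step(self, inputs, target, rate=0.5):
        """Nudge every weight down the error gradient for one pattern."""
        hidden, output = self.forward(inputs)
        # Error signal at the output unit: feedback on the response's accuracy.
        delta_out = (output - target) * output * (1.0 - output)
        # Error signals for hidden units, passed back through the old weights.
        delta_hidden = [delta_out * self.w_out[j] * h * (1.0 - h)
                        for j, h in enumerate(hidden)]
        for j, h in enumerate(hidden):
            self.w_out[j] -= rate * delta_out * h
        self.w_out[-1] -= rate * delta_out
        for j, ws in enumerate(self.w_hidden):
            for i, x in enumerate(inputs):
                ws[i] -= rate * delta_hidden[j] * x
            ws[-1] -= rate * delta_hidden[j]


if __name__ == "__main__":
    # Hypothetical task: exclusive-or, which cannot be solved without hidden units.
    patterns = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0),
                ([1.0, 0.0], 1.0), ([1.0, 1.0], 0.0)]
    net = TinyNetwork(n_in=2, n_hidden=3, seed=1)
    for _ in range(5000):
        for inputs, target in patterns:
            net.train_step(inputs, target)
    for inputs, target in patterns:
        _, response = net.forward(inputs)
        print(inputs, "target", target, "response", round(response, 3))
```

Nothing in the code stores a rule for the task. Whatever the network comes to know is carried by the values in `w_hidden` and `w_out`, and the hidden activities returned by `forward` play the role of the learned internal representations. Whether a given run solves the task depends on the random starting weights and the parameter choices, so the printed responses should be inspected rather than assumed.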

[1]  Francis Crick,et al.  The recent excitement about neural networks , 1989, Nature.

[2]  G. Dell,et al.  Lexical access in aphasic and nonaphasic speakers. , 1997, Psychological review.

[3]  Julie C. Sedivy,et al.  Eye movements as a window into real-time spoken language comprehension in natural contexts , 1995, Journal of psycholinguistic research.

[4]  W. Marslen-Wilson,et al.  The temporal structure of spoken language understanding , 1980, Cognition.

[5]  Geoffrey E. Hinton,et al.  Parallel Models of Associative Memory , 1989 .

[6]  Geoffrey E. Hinton,et al.  Lesioning an attractor network: investigations of acquired dyslexia , 1991 .

[7]  J. Locke,et al.  Phonological acquisition and change , 1983 .

[8]  James L. McClelland,et al.  Can a perceptual processing deficit explain the impairment of inflectional morphology in developmental dysphasia? A computational investigation. , 1993 .

[9]  Marilyn M. Vihman,et al.  Phonological Development , 2014 .

[10]  J. Jaeger A positron emission tomographic study of regular and irregular verb morphology in English , 1996 .

[11]  James L. McClelland,et al.  On learning the past-tenses of English verbs: implicit rules or parallel distributed processing , 1986 .

[12]  Edmund T. Rolls,et al.  Introduction to Connectionist Modelling of Cognitive Processes , 1998 .

[13]  V. Marchman,et al.  U-shaped learning and frequency effects in a multi-layered perceptron: Implications for child language acquisition , 1991, Cognition.

[14]  James Hoeffner,et al.  Are Rules a Thing of the Past?: The Acquisition of Verbal Morphology by an Attractor Network. , 1992 .

[15]  Mark S. Seidenberg,et al.  Evaluating behavioral and neuroimaging data on past tense processing , 1998, Language.

[16]  Noam Chomsky,et al.  The Minimalist Program , 1992 .

[17]  Garrison W. Cottrell,et al.  Acquiring the Mapping from Meaning to Sounds , 1994, Connect. Sci..

[18]  Eric L. Schwartz,et al.  Computational Neuroscience , 1993, Neuromethods.

[19]  Mark S. Seidenberg,et al.  Phonology, reading acquisition, and dyslexia: insights from connectionist models. , 1999, Psychological review.

[20]  V. Marchman,et al.  From rote learning to system building: acquiring verb morphology in children and connectionist nets , 1993, Cognition.

[21]  Tim van Gelder,et al.  Compositionality: A Connectionist Variation on a Classical Theme , 1990, Cogn. Sci..

[22]  Michael I. Jordan,et al.  Goal-based speech motor control: A theoretical framework and some preliminary data , 1995 .

[23]  Michael I. Jordan,et al.  Sensorimotor adaptation in speech production. , 1998, Science.

[24]  Mark S. Seidenberg,et al.  Rules or connections? The past tense revisited , 1992 .

[25]  Jeffrey L. Elman,et al.  Finding Structure in Time , 1990, Cogn. Sci..

[26]  Myrna F. Schwartz,et al.  Modular deficits in Alzheimer-type dementia , 1990 .

[27]  J. Fodor,et al.  Connectionism and cognitive architecture: A critical analysis , 1988, Cognition.

[28]  James L. McClelland,et al.  An interactive activation model of context effects in letter perception: part 1.: an account of basic findings , 1988 .

[29]  James L. McClelland,et al.  A distributed, developmental model of word recognition and naming. , 1989, Psychological review.

[30]  D. Ingram Phonological Disability in Children , 1976 .

[31]  J. Elman Learning and development in neural networks: the importance of starting small , 1993, Cognition.

[32]  Geoffrey E. Hinton,et al.  Distributed Representations , 1986, The Philosophy of Artificial Intelligence.

[33]  S. Crain Language acquisition in the absence of experience , 1991, Behavioral and Brain Sciences.

[34]  S. Pinker,et al.  Connections and symbols , 1988 .

[35]  Merrill F. Garrett,et al.  Sentence processing , 1990 .

[36]  Randall C. O'Reilly,et al.  Biologically Plausible Error-Driven Learning Using Local Activation Differences: The Generalized Recirculation Algorithm , 1996, Neural Computation.

[37]  H. Benedict,et al.  Early lexical development: comprehension and production , 1979, Journal of Child Language.

[38]  K. Patterson Phonological ALEXIA or PHONOLOGICAL alexia , 1995 .

[39]  Peter F. MacNeilage,et al.  Acquisition of Speech Production: The Achievement of Segmental Independence , 1990 .

[40]  Lyn Frazier,et al.  Theories of sentence processing , 1987 .

[41]  Kim S. Graham,et al.  The relationship between comprehension and oral reading in progressive fluent aphasia , 1995 .

[42]  Steven Pinker,et al.  Language learnability and language development , 1985 .

[43]  M. Zorzi,et al.  Two routes or one in reading aloud? A connectionist dual-process model. , 1998 .

[44]  James L. McClelland,et al.  Sentence comprehension: A parallel distributed processing approach , 1989, Language and Cognitive Processes.

[45]  Karalyn Patterson Deterioration of word meaning: implications for reading , 1995 .

[46]  Geoffrey E. Hinton,et al.  Learning representations by back-propagating errors , 1986, Nature.

[47]  Michael Studdert-Kennedy,et al.  Discovering phonetic function , 1993 .

[48]  Noam Chomsky,et al.  The Sound Pattern of English , 1968 .

[49]  M. Coltheart,et al.  Surface dyslexia. , 1983, The Quarterly journal of experimental psychology. A, Human experimental psychology.

[50]  H. Kucera,et al.  Computational analysis of present-day American English , 1967 .

[51]  D. Plaut Double dissociation without modularity: evidence from connectionist neuropsychology. , 1995, Journal of clinical and experimental neuropsychology.

[52]  J. Elman Distributed Representations, Simple Recurrent Networks, And Grammatical Structure , 1991 .

[53]  Douglas L. T. Rohde,et al.  Language acquisition in the absence of explicit negative evidence: how important is starting small? , 1999, Cognition.

[54]  Geoffrey E. Hinton Learning and Applying Contextual Constraints in Sentence Comprehension , 1991 .

[55]  A. Marchal,et al.  Speech production and speech modelling , 1990 .

[56]  James L. McClelland,et al.  Understanding normal and impaired word reading: computational principles in quasi-regular domains. , 1996, Psychological review.

[57]  Noam Chomsky,et al.  Aspects of the theory of syntax , 1965 .

[58]  Mark S. Seidenberg,et al.  Language Acquisition and Use: Learning and Applying Probabilistic Constraints , 1997, Science.

[59]  S. Pinker The Language Instinct , 1994 .

[60]  W. Kintsch,et al.  Strategies of discourse comprehension , 1983 .

[61]  James L. McClelland,et al.  An interactive activation model of context effects in letter perception: I. An account of basic findings. , 1981 .

[62]  J. Elman,et al.  Learning and morphological change , 1995, Cognition.

[63]  S Pinker,et al.  Rules of language. , 1991, Science.

[64]  Maryellen C. MacDonald,et al.  The lexical nature of syntactic ambiguity resolution , 1994 .

[65]  M. Raijmakers Rethinking innateness: A connectionist perspective on development. , 1997 .

[66]  Mitchell P. Marcus,et al.  A theory of syntactic recognition for natural language , 1979 .

[67]  Brian MacWhinney,et al.  The emergence of language. , 1999 .

[68]  Gregg C. Oden,et al.  Semantic constraints and judged preference for interpretations of ambiguous sentences , 1978 .

[69]  G. Kane Parallel Distributed Processing: Explorations in the Microstructure of Cognition, vol 1: Foundations, vol 2: Psychological and Biological Models , 1994 .

[70]  James L. McClelland,et al.  Finite State Automata and Simple Recurrent Networks , 1989, Neural Computation.

[71]  Jeffrey L. Elman,et al.  Distributed Representations, Simple Recurrent Networks, and Grammatical Structure , 1991, Mach. Learn..

[72]  Michael I. Jordan,et al.  Forward Models: Supervised Learning with a Distal Teacher , 1992, Cogn. Sci..

[73]  Alvin M. Liberman,et al.  Speech: A Special Code , 1996 .

[74]  A. Meltzoff,et al.  Imitation of Facial and Manual Gestures by Human Neonates , 1977, Science.

[75]  John L. Locke Development of the Capacity for Spoken Language , 2019, The Handbook of Child Language.

[76]  D. Holender,et al.  Analytic approaches to human cognition , 1992 .

[77]  K R Wagner,et al.  How much do children say in a day? , 1985, Journal of Child Language.

[78]  V. Marchman Constraints on Plasticity in a Connectionist Model of the English Past Tense , 1993, Journal of Cognitive Neuroscience.

[79]  V. Marchman,et al.  Learning from a connectionist model of the acquisition of the English past tense , 1996, Cognition.

[80]  J. Hodges,et al.  The relationship between comprehension and oral reading in progressive fluent aphasia , 1994, Neuropsychologia.

[81]  Joan L. Bybee,et al.  Rules and schemas in the development and use of the English past tense , 1982 .

[82]  James L. McClelland,et al.  Mechanisms of Sentence Processing: Assigning Roles to Constituents of Sentences , 1986 .

[83]  James L. McClelland,et al.  Parallel Distributed Processing: Explorations in the Microstructure of Cognition : Psychological and Biological Models , 1986 .

[84]  B. MacWhinney,et al.  Implementations are not conceptualizations: Revising the verb learning model , 1991, Cognition.

[85]  S. Pinker,et al.  A Neural Dissociation within Language: Evidence that the Mental Dictionary Is Part of Declarative Memory, and that Grammatical Rules Are Processed by the Procedural System , 1997, Journal of Cognitive Neuroscience.

[86]  G S Dell,et al.  A spreading-activation theory of retrieval in sentence production. , 1986, Psychological review.

[87]  Murray Grossman,et al.  Sentence comprehension in Parkinson's disease: The role of attention and memory , 1992, Brain and Language.

[88]  R. Ratcliff,et al.  The comprehension processes and memory structures involved in instrumental inference , 1981 .

[89]  James L. McClelland,et al.  Parallel distributed processing: explorations in the microstructure of cognition, vol. 1: foundations , 1986 .

[90]  Max Coltheart Phonological dyslexia : a special issue of cognitive neuropsychology , 1996 .

[91]  Stephen A. Ritz,et al.  Distinctive features, categorical perception, and probability learning: some applications of a neural model , 1977 .

[92]  S. Pinker,et al.  On language and connectionism: Analysis of a parallel distributed processing model of language acquisition , 1988, Cognition.

[93]  T. Bever,et al.  The relation between linguistic structure and associative theories of language learning—A constructive critique of some connectionist learning models , 1988, Cognition.

[94]  William D. Marslen-Wilson,et al.  Dissociating types of mental computation , 1997, Nature.

[95]  P. Jusczyk The discovery of spoken language , 1997 .

[96]  Richard C. Anderson,et al.  On putting apples into bottles — A problem of polysemy , 1975, Cognitive Psychology.

[97]  R. Hans Phaf,et al.  Connectionism and psychology: A psychological perspective on new connectionist research: Philip Quinlan, Harvester Wheatsheaf, New York, 1991. ISBN 0-7450-0834-8 , 1994 .

[98]  C. Clifton,et al.  The independence of syntactic processing , 1986 .