Connectionist Approaches to Reading

Reading is a highly complex task, involving the rapid coordination of visual, phonological, semantic and linguistic processes. Computational models have played a key role in the scientific study of reading. These models allow us to explore the implications of specific hypotheses concerning the representations and processes underlying reading acquisition and performance. A particular form of computational modeling, known as connectionist or neural network modeling, offers the further advantage of being explicit about how such mechanisms might be implemented in the brain. In connectionist models, cognitive processes take the form of cooperative and competitive interactions among large numbers of simple neuron-like processing units. Typically, each unit has a real-valued activity level, roughly analogous to the firing rate of a neuron. Unit interactions are governed by weighted connections that encode the long-term knowledge of the system and are learned gradually through experience. Units are often organized into layers or groups: the activity of some groups of units encodes the input to the system, and the resulting activity of other groups of units encodes the system’s response to that input. For example, one group might encode the written form (orthography) of a word, another might encode its spoken form (phonology), and a third might encode its meaning (semantics; see Figure 1). The patterns of activity of the remaining groups of units (sometimes termed “hidden” units) constitute learned, internal representations that mediate between inputs and outputs. In this way, the connectionist approach attempts to capture the essential computational properties of the vast ensembles of real neuronal elements found in the brain using simulations of smaller networks of more abstract units. By linking neural computation to behavior, the framework enables developmental, cognitive and neurobiological issues to be addressed within a single, integrated formalism.
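The ideas in the preceding paragraph (units with real-valued activity, weighted connections, a hidden layer, and gradual learning) can be made concrete with a toy simulation. The sketch below is only illustrative and is not taken from any published reading model: the four "letter" input units, two "phoneme" output units, three hidden units, the sigmoid activation function, the use of back-propagation as the learning rule, and all names (`PATTERNS`, `layer_activity`, `train_step`, `train`) are assumptions chosen for brevity.

```python
import math
import random

# Hypothetical toy task: 4 "orthographic" input units -> 2 "phonological" output units.
PATTERNS = [
    ([1.0, 0.0, 0.0, 0.0], [0.0, 0.0]),
    ([0.0, 1.0, 0.0, 0.0], [0.0, 1.0]),
    ([0.0, 0.0, 1.0, 0.0], [1.0, 0.0]),
    ([0.0, 0.0, 0.0, 1.0], [1.0, 1.0]),
]

def sigmoid(x):
    """Squash net input into a real-valued activity level between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))

def layer_activity(inputs, weights):
    """Activity of each receiving unit; the last weight in each row is a bias."""
    extended = inputs + [1.0]
    return [sigmoid(sum(w * a for w, a in zip(row, extended))) for row in weights]

def init_weights(rng, n_units, n_inputs):
    """Small random weights: one row of incoming connections per unit."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)]
            for _ in range(n_units)]

def train_step(x, target, w_hid, w_out, rate):
    """One small weight adjustment (back-propagation, assumed here)."""
    hidden = layer_activity(x, w_hid)
    output = layer_activity(hidden, w_out)
    # Error signals for output and hidden units, computed before any update.
    d_out = [(t - o) * o * (1.0 - o) for t, o in zip(target, output)]
    d_hid = [h * (1.0 - h) * sum(d * w_out[k][j] for k, d in enumerate(d_out))
             for j, h in enumerate(hidden)]
    for k, d in enumerate(d_out):
        for j, a in enumerate(hidden + [1.0]):
            w_out[k][j] += rate * d * a
    for j, d in enumerate(d_hid):
        for i, a in enumerate(x + [1.0]):
            w_hid[j][i] += rate * d * a
    # Squared error measured before this step's weight change.
    return sum((t - o) ** 2 for t, o in zip(target, output))

def train(patterns, n_hidden=3, epochs=3000, rate=0.5, seed=0):
    """Repeatedly present every pattern; return weights and per-epoch error."""
    rng = random.Random(seed)
    w_hid = init_weights(rng, n_hidden, len(patterns[0][0]))
    w_out = init_weights(rng, len(patterns[0][1]), n_hidden)
    errors = []
    for _ in range(epochs):
        errors.append(sum(train_step(x, t, w_hid, w_out, rate)
                          for x, t in patterns))
    return w_hid, w_out, errors
```

Calling `train(PATTERNS)` returns the learned weights together with the summed squared error for each pass through the patterns, so the gradual character of learning can be inspected by comparing early and late entries of the error list. The hidden-unit activities produced by `layer_activity(x, w_hid)` play the role of the learned internal representations described above.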
One very important advantage of connectionist models is that they deal explicitly with learning. Although many of these models have focussed predominantly on simulating aspects of adult rather than children's reading, many do explicitly consider the process of learning (e.g., Plaut, McClelland, Seidenberg & Patterson, 1996; Seidenberg & McClelland, 1989). In essence, such models instantiate learning as a slow, incremental increase in knowledge, represented by increasingly strong and accurate connections between different units (e.g., the letters in printed words and the phonemes in spoken words to which they correspond). Another critical feature of many connectionist systems is that after learning they show the ability to generalize (e.g., to pronounce novel words on which they have not been trained). Finally, and related to this, such systems often show graceful degradation when damaged. Removing units or connections in such systems typically does not result in an all-or-none loss of knowledge; rather,

[1]  D. D. Wheeler Processes in word recognition , 1970 .

[2]  M. Page,et al.  Connectionist modelling in psychology: A localist manifesto , 2000, Behavioral and Brain Sciences.

[3]  J. Marshall,et al.  Patterns of paralexia: a psycholinguistic approach. , 1995, Journal of psycholinguistic research.

[4]  M Coltheart,et al.  DRC: a dual route cascaded model of visual word recognition and reading aloud. , 2001, Psychological review.

[5]  M. Coltheart Lexical access in simple reading tasks , 1978 .

[6]  Geoffrey E. Hinton,et al.  Learning and relearning in Boltzmann machines , 1986 .

[7]  James L. McClelland,et al.  Parallel distributed processing: explorations in the microstructure of cognition, vol. 1: foundations , 1986 .

[8]  E. Capaldi,et al.  The organization of behavior. , 1992, Journal of applied behavior analysis.

[9]  David C. Plaut,et al.  Structure and Function in the Lexical System: Insights from Distributed Models of Word Reading and Lexical Decision , 1997 .

[10]  Geoffrey E. Hinton,et al.  A Learning Algorithm for Boltzmann Machines , 1985, Cogn. Sci..

[11]  Mark S. Seidenberg,et al.  Does Word Identification Proceed From Spelling to Sound to Meaning , 1991 .

[12]  G. C. Orden A ROWS is a ROSE: Spelling, sound, and reading , 1987 .

[13]  D. Plaut Relearning after Damage in Connectionist Networks: Toward a Theory of Rehabilitation , 1996, Brain and Language.

[14]  M. Mozer Letter migration in word perception. , 1983, Journal of experimental psychology. Human perception and performance.

[15]  James L. McClelland Stochastic interactive processes and the effect of context on perception , 1991, Cognitive Psychology.

[16]  Mark S. Seidenberg,et al.  Modeling the Successes and Failures of Interventions for Disabled Readers , 2003 .

[17]  Wayne A. Wickelgren Context-sensitive coding, associative memory, and serial order in (speech) behavior. , 1969 .

[18]  M. Zorzi,et al.  Two routes or one in reading aloud? A connectionist dual-process model. , 1998 .

[19]  Mark S. Seidenberg,et al.  Phonology, reading acquisition, and dyslexia: insights from connectionist models. , 1999, Psychological review.

[20]  Garrison W. Cottrell,et al.  Lexical Ambiguity Resolution: Perspectives from Psycholinguistics, Neuropsychology, and Artificial Intelligence , 1988 .

[21]  Argye E. Hillis,et al.  Category-specific naming and comprehension impairment: a double dissociation , 1998 .

[22]  J. Bullinaria Modeling Reading, Spelling, and Past Tense Learning with Artificial Neural Networks , 1997, Brain and Language.

[23]  James L. McClelland,et al.  The Morton-Massaro law of information integration: implications for models of perception. , 2001, Psychological review.

[24]  B. Ans,et al.  A connectionist multiple-trace memory model for polysyllabic word reading. , 1998, Psychological review.

[25]  James L. McClelland,et al.  Conspiracy effects in word pronunciation. , 1987 .

[26]  Mark S. Seidenberg,et al.  Chapter 5 Beyond Orthographic Depth in Reading: Equitable Division of Labor , 1992 .

[27]  George A. Miller,et al.  Introduction to WordNet: An On-line Lexical Database , 1990 .

[28]  Ken N. Seergobin,et al.  On the association between connectionism and data: Are a few words necessary? , 1990 .

[29]  E K Warrington,et al.  Concrete word dyslexia. , 1981, British journal of psychology.

[30]  James L. McClelland,et al.  An interactive activation model of context effects in letter perception: I. An account of basic findings. , 1981 .

[31]  Randall C. O'Reilly,et al.  Biologically Plausible Error-Driven Learning Using Local Activation Differences: The Generalized Recirculation Algorithm , 1996, Neural Computation.

[32]  S. Andrews Phonological recoding: Is the regularity effect consistent? , 1982 .

[33]  A. H. Kawamoto Nonlinear dynamics in the resolution of lexical ambiguity: A parallel distributed processing account. , 1993 .

[34]  James L. McClelland,et al.  An interactive activation model of context effects in letter perception: Part 2. The contextual enhancement effect and some tests and extensions of the model. , 1982, Psychological review.

[35]  James L. McClelland,et al.  An interactive activation model of context effects in letter perception: part 1.: an account of basic findings , 1988 .

[36]  G. M. Reicher Perceptual recognition as a function of meaningfulness of stimulus material. , 1969, Journal of experimental psychology.

[37]  David C. Plaut,et al.  A connectionist approach to word reading and acquired dyslexia: extension to sequential processing , 1999, Cogn. Sci..

[38]  Richard S. Sutton,et al.  Learning to predict by the methods of temporal differences , 1988, Machine Learning.

[39]  Terrence J. Sejnowski,et al.  Parallel Networks that Learn to Pronounce English Text , 1987, Complex Syst..

[40]  Marvin Minsky,et al.  Perceptrons: An Introduction to Computational Geometry , 1969 .

[41]  James L. McClelland,et al.  Stipulating versus discovering representations , 2000, Behavioral and Brain Sciences.

[42]  A. Pollatsek,et al.  Automatic access of semantic information by phonological codes in visual word recognition. , 1993, Journal of experimental psychology. Learning, memory, and cognition.

[43]  T. Shallice,et al.  Deep Dyslexia: A Case Study of , 1993 .

[44]  D. Neary,et al.  Deep Dyslexia , 1986 .

[45]  Paul W. B. Atkins,et al.  Models of reading aloud: Dual-route and parallel-distributed-processing approaches. , 1993 .

[46]  R. Burchfield Frequency Analysis of English Usage: Lexicon and Grammar. By W. Nelson Francis and Henry Kučera with the assistance of Andrew W. Mackie. Boston: Houghton Mifflin. 1982. x + 561 , 1985 .

[47]  James L. McClelland,et al.  Nonword pronunciation and models of word recognition. , 1994, Journal of experimental psychology. Human perception and performance.

[48]  Geoffrey E. Hinton,et al.  Lesioning an attractor network: investigations of acquired dyslexia , 1991 .

[49]  M. Mozer,et al.  On the Interaction of Selective Attention and Lexical Knowledge: A Connectionist Account of Neglect Dyslexia , 1990, Journal of Cognitive Neuroscience.

[50]  Geoffrey E. Hinton,et al.  Learning representations by back-propagating errors , 1986, Nature.

[51]  M. Coltheart,et al.  Surface dyslexia. , 1983, The Quarterly journal of experimental psychology. A, Human experimental psychology.

[52]  H. Kucera,et al.  Computational analysis of present-day American English , 1967 .

[53]  D. Plaut Double dissociation without modularity: evidence from connectionist neuropsychology. , 1995, Journal of clinical and experimental neuropsychology.

[54]  James L. McClelland,et al.  A distributed, developmental model of word recognition and naming. , 1989, Psychological review.

[55]  Christopher T. Kello,et al.  Locus of the exception effect in naming , 1994 .

[56]  Francis Crick,et al.  The recent excitement about neural networks , 1989, Nature.

[57]  D. Plaut,et al.  A LITERATURE REVIEW AND NEW DATA SUPPORTING AN INTERACTIVE ACCOUNT OF LETTER-BY-LETTER READING. , 1998, Cognitive neuropsychology.

[58]  Mark S. Seidenberg,et al.  Spelling-sound effects in reading: Time-course and decision criteria , 1985, Memory & cognition.

[59]  A. H. Kawamoto Distributed Representations of Ambiguous Words and Their Resolution in a Connectionist Network , 1988 .

[60]  G. Stone,et al.  Word identification in reading and the promise of subsymbolic psycholinguistics. , 1990, Psychological review.

[61]  James L. McClelland,et al.  Understanding normal and impaired word reading: computational principles in quasi-regular domains. , 1996, Psychological review.

[62]  Michael C. Mozer,et al.  Perception of multiple objects - a connectionist approach , 1991, Neural network modeling and connectionism.

[63]  Pseudohomophone effects and models of word recognition. , 1996 .

[64]  Carsten Peterson,et al.  A Mean Field Theory Learning Algorithm for Neural Networks , 1987, Complex Syst..

[65]  J Grainger,et al.  Orthographic processing in visual word recognition: a multiple read-out model. , 1996, Psychological review.

[66]  James L. McClelland,et al.  The role of familiar units in perception of words and nonwords , 1977 .

[67]  John Lyons,et al.  Linguistic Semantics , 2000 .

[68]  Mark S. Seidenberg,et al.  When does irregular spelling or pronunciation influence word recognition , 1984 .

[69]  Mark S. Seidenberg,et al.  Computing the meanings of words in reading: cooperative division of labor between visual and phonological processes. , 2004, Psychological review.

[70]  Jing Peng,et al.  An Efficient Gradient-Based Algorithm for On-Line Training of Recurrent Network Trajectories , 1990, Neural Computation.

[71]  T. Carr,et al.  Perceptual flexibility in word recognition: strategies affect orthographic computation but not lexical access. , 1978, Journal of experimental psychology. Human perception and performance.

[72]  D. Massaro Some criticisms of connectionist models of human performance , 1988 .

[73]  G. C. Orden,et al.  Interdependence of form and function in cognitive systems explains perception of printed words. , 1994, Journal of experimental psychology. Human perception and performance.