Representation, Learning, Generalization and Damage in Neural Network Models of Reading Aloud

We present a new class of neural network models of reading aloud based on Sejnowski & Rosenberg’s NETtalk. Unlike previous models, they are not restricted to monosyllabic words, require no complicated input-output representations such as Wickelfeatures, and require no preprocessing to align the letters and phonemes in the training data. In the best cases, the models achieve perfect performance on the Seidenberg & McClelland training corpus (which includes many irregular words) and in excess of 95% on a standard set of pronounceable non-words. Evidence is presented that relates the output activation error scores in the models to naming latencies in humans. Several possible accounts of developmental surface dyslexia are identified, and under various forms of damage the models exhibit symptoms similar to acquired surface dyslexia. However, their inability to account for lexical decision, the pseudohomophone effect and phonological dyslexia indicates that an additional lexical/semantic route will still be needed before we have a complete model of reading aloud. Nevertheless, the models’ simplicity, performance and room for improvement make them a promising basis for the grapheme-phoneme conversion route of a realistic dual route model of reading.

Edinburgh University Technical Report 94/1 – November 1994

[1]  H. Kucera,et al.  Computational analysis of present-day American English , 1967 .

[2]  Wayne A. Wickelgren Context-sensitive coding, associative memory, and serial order in (speech) behavior. , 1969 .

[3]  Marvin Minsky,et al.  Perceptrons: An Introduction to Computational Geometry , 1969 .

[4]  J. Marshall,et al.  Patterns of paralexia: a psycholinguistic approach. , 1973, Journal of psycholinguistic research.

[5]  K. Forster,et al.  Lexical Access and Naming Time. , 1973 .

[6]  J. Baron,et al.  Use of orthographic and word-specific knowledge in reading words aloud. , 1976 .

[7]  J. Frederiksen,et al.  Spelling and sound: Approaches to the internal lexicon. , 1976 .

[8]  James L. McClelland On the time relations of mental processes: An examination of systems of processes in cascade. , 1979 .

[9]  R. Glushko The Organization and Activation of Orthographic Knowledge in Reading Aloud. , 1979 .

[10]  U. Frith Cognitive Processes in Spelling , 1980 .

[11]  L. Henderson Orthography and Word Recognition in Reading , 1982 .

[12]  S. Andrews Phonological recoding: Is the regularity effect consistent? , 1982 .

[13]  E. Funnell Phonological processes in reading: new evidence from acquired dyslexia. , 1983, British journal of psychology.

[14]  Robert L. Mercer,et al.  An information theoretic approach to the automatic determination of phonemic baseforms , 1984, ICASSP.

[15]  Mark S. Seidenberg,et al.  When does irregular spelling or pronunciation influence word recognition , 1984 .

[16]  G. Humphreys,et al.  Are there independent lexical and nonlexical routes in word processing? An evaluation of the dual-route theory of reading , 1985, Behavioral and Brain Sciences.

[17]  Mark S. Seidenberg,et al.  Spelling-sound effects in reading: Time-course and decision criteria , 1985, Memory & cognition.

[18]  N. Geschwind,et al.  Mechanisms of Change after Brain Lesions , 1985, Annals of the New York Academy of Sciences.

[19]  Geoffrey E. Hinton,et al.  Learning internal representations by error propagation , 1986 .

[20]  James L. McClelland,et al.  On learning the past-tenses of English verbs: implicit rules or parallel distributed processing , 1986 .

[21]  G S Dell,et al.  A spreading-activation theory of retrieval in sentence production. , 1986, Psychological review.

[22]  E. Warrington,et al.  Phonological Reading: Phenomena and Paradoxes , 1986, Cortex.

[23]  D. Besner,et al.  Reading pseudohomophones: Implications for models of pronunciation assembly and the locus of word-frequency effects in naming. , 1987 .

[24]  Geoffrey E. Hinton Learning Translation Invariant Recognition in Massively Parallel Networks , 1987, PARLE.

[25]  James L. McClelland,et al.  Conspiracy effects in word pronunciation. , 1987 .

[26]  Gordon D. A. Brown Resolving inconsistency: A computational model of word naming , 1987 .

[27]  D. Baxter Surface Dyslexia: Neuropsychological and Cognitive Studies of Phonological Reading , 1987 .

[28]  Terrence J. Sejnowski,et al.  Parallel Networks that Learn to Pronounce English Text , 1987, Complex Syst..

[29]  Derek Besner,et al.  Word recognition and identification: Do word-frequency effects reflect lexical access? , 1988 .

[30]  K. Kirsner,et al.  Discovering functionally independent mental processes: the principle of reversed association. , 1988, Psychological review.

[31]  R. Treiman,et al.  Units in reading and spelling , 1988 .

[32]  Veronika Coltheart,et al.  Phonological Recoding in Reading for Meaning by Adults and Children , 1988 .

[33]  James L. McClelland,et al.  An interactive activation model of context effects in letter perception: Part 1. An account of basic findings , 1988 .

[34]  G. C. Orden,et al.  Word identification in reading proceeds from spelling to sound to meaning. , 1988, Journal of experimental psychology. Learning, memory, and cognition.

[35]  David Haussler,et al.  What Size Net Gives Valid Generalization? , 1989, Neural Computation.

[36]  Etienne Deprit Implementing recurrent back-propagation on the connection machine , 1989, Neural Networks.

[37]  Hervé Bourlard,et al.  Generalization and Parameter Estimation in Feedforward Nets: Some Experiments , 1989, NIPS.

[38]  James L. McClelland,et al.  A distributed, developmental model of word recognition and naming. , 1989, Psychological review.

[39]  Richard J. Brown Neuropsychology Mental Structure , 1989 .

[40]  James L. McClelland,et al.  Connections and disconnections: Acquired dyslexia in a computational model of reading processes. , 1989 .

[41]  Michael C. Mozer,et al.  Using Relevance to Reduce Network Size Automatically , 1989 .

[42]  James L. McClelland,et al.  More Words but Still No Lexicon: Reply to Besner et al. (1990) , 1990 .

[43]  David E. Rumelhart,et al.  Predicting the Future: a Connectionist Approach , 1990, Int. J. Neural Syst..

[44]  Jeffrey L. Elman,et al.  Finding Structure in Time , 1990, Cogn. Sci..

[45]  Philip B. Gough,et al.  Two ideas about spelling: Rules and word-specific memory , 1990 .

[46]  Ehud D. Karnin,et al.  A simple procedure for pruning back-propagation trained neural networks , 1990, IEEE Trans. Neural Networks.

[47]  Mark S. Seidenberg,et al.  The basis of consistency effects in word naming , 1990 .

[48]  Michael I. Jordan Attractor dynamics and parallelism in a connectionist sequential machine , 1990 .

[49]  Robert I. Damper,et al.  Novel-word pronunciation within a text-to-speech system , 1990, SSW.

[50]  Anders Krogh,et al.  A Simple Weight Decay Can Improve Generalization , 1991, NIPS.

[51]  James L. McClelland,et al.  Learning the structure of event sequences. , 1991, Journal of experimental psychology. General.

[52]  Geoffrey E. Hinton,et al.  Lesioning an attractor network: investigations of acquired dyslexia. , 1991, Psychological review.

[53]  James L. McClelland,et al.  Reading exception words and pseudowords: Are two routes really necessary? , 1992 .

[54]  Geoffrey E. Hinton,et al.  Simplifying Neural Networks by Soft Weight-Sharing , 1992, Neural Computation.

[55]  T. Shallice,et al.  Deep Dyslexia: A Case Study of , 1993 .

[56]  James L. McClelland,et al.  Generalization with Componential Attractors: Word and Nonword Reading in an Attractor Network , 1993 .

[57]  Paul W. B. Atkins,et al.  Models of reading aloud: Dual-route and parallel-distributed-processing approaches. , 1993 .

[58]  J. A. Bullinaria,et al.  Double dissociation in artificial neural networks: Implications for neuropsychology , 1993 .

[59]  T. Shallice,et al.  Reading without Semantics , 1983 .

[60]  Nick Chater,et al.  Connectionist modelling: Implications for cognitive neuropsychology , 1995 .