Capturing Gradience, Continuous Change, and Quasi‐Regularity in Sound, Word, Phrase, and Meaning

One vision of the nature of language holds that a language consists of a set of symbolic unit types, and a set of units of each type, together with a set of grammatical principles that constrain how these units can be used to compose other units, and a system of rules that project structured arrangements of such units onto other structured arrangements of units (for example, from syntactic to semantic structure). An alternative vision of the nature of language holds that it is often useful to characterize language as if the above statements were true, but only as a way of approximately notating or summarizing aspects of language. In reality, according to this alternative vision, approximate conformity to structured systems of symbolic units and rules arises historically, developmentally, and in the moment, from the processes that operate as users communicate with each other using sound or gesture as their medium of communication. These acts of communication leave residues that can be thought of as storing knowledge in the form of the continuous-valued parameters of a complex dynamical system (i.e., a system characterized by continuous, stochastic, and non-linear differential equations). Greatly influenced by the work of Joan Bybee (1985, 2001) and others who have pointed out some of its advantages, I am a disciple of this alternative vision (Bybee and McClelland, 2005; McClelland and Bybee, 2007). As argued in the Bybee and McClelland papers just cited, neural network models that rely on distributed representations (sometimes called connectionist or parallel distributed processing models) provide one useful way of capturing features of this vision. Such models are, in general, just the sort of continuous, stochastic, non-linear systems that are needed to capture the key phenomena, and the connection weights and other variables in such networks are the continuous-valued parameters in which

[1]  Noam Chomsky,et al.  Aspects of the theory of syntax , 1965 .

[2]  B. Milner Amnesia following operation on the temporal lobes , 1966 .

[3]  D Marr,et al.  Simple memory: a theory for archicortex. , 1971, Philosophical transactions of the Royal Society of London. Series B, Biological sciences.

[4]  R. Brown,et al.  A First Language , 1973 .

[5]  Mark Aronoff,et al.  Word Formation in Generative Grammar , 1979 .

[6]  Morris Halle,et al.  The rules of language , 1980, IEEE Transactions on Professional Communication.

[7]  James L. McClelland,et al.  An interactive activation model of context effects in letter perception: I. An account of basic findings. , 1981 .

[8]  R. Nosofsky Choice, Similarity, and the Context Theory of Classification , 1984 .

[9]  Joan L. Bybee Morphology: A study of the relation between meaning and form , 1985 .

[10]  James L. McClelland,et al.  Distributed memory and the representation of general and specific information. , 1985, Journal of experimental psychology. General.

[11]  Geoffrey E. Hinton,et al.  Learning representations by back-propagating errors , 1986, Nature.

[12]  James L. McClelland,et al.  On learning the past-tenses of English verbs: implicit rules or parallel distributed processing , 1986 .

[13]  Terrence J. Sejnowski,et al.  Parallel Networks that Learn to Pronounce English Text , 1987, Complex Syst..

[14]  S. Pinker,et al.  On language and connectionism: Analysis of a parallel distributed processing model of language acquisition , 1988, Cognition.

[15]  J. Fodor,et al.  Connectionism and cognitive architecture: A critical analysis , 1988, Cognition.

[16]  T. Bever,et al.  The relation between linguistic structure and associative theories of language learning—A constructive critique of some connectionist learning models , 1988, Cognition.

[17]  James L. McClelland,et al.  An interactive activation model of context effects in letter perception: part 1.: an account of basic findings , 1988 .

[18]  R. Taraban,et al.  Language learning: Cues or rules? , 1989 .

[19]  James L. McClelland,et al.  A distributed, developmental model of word recognition and naming. , 1989, Psychological review.

[20]  James L. McClelland,et al.  Sentence comprehension: A parallel distributed processing approach , 1989, Language and Cognitive Processes.

[21]  Michael McCloskey,et al.  Catastrophic Interference in Connectionist Networks: The Sequential Learning Problem , 1989 .

[22]  James L. McClelland,et al.  Learning and Applying Contextual Constraints in Sentence Comprehension , 1990, Artif. Intell..

[23]  Jeffrey L. Elman,et al.  Finding Structure in Time , 1990, Cogn. Sci..

[24]  Jordan B. Pollack,et al.  Recursive Distributed Representations , 1990, Artif. Intell..

[25]  David E. Rumelhart,et al.  Brain style computation: learning and generalization , 1990 .

[26]  V. Marchman,et al.  U-shaped learning and frequency effects in a multi-layered perceptron: Implications for child language acquisition , 1991, Cognition.

[27]  Risto Miikkulainen,et al.  Natural Language Processing With Modular PDP Networks and Distributed Lexicon , 1991, Cogn. Sci..

[28]  B. MacWhinney,et al.  Implementations are not conceptualizations: Revising the verb learning model , 1991, Cognition.

[29]  L. Squire Memory and the hippocampus: a synthesis from findings with rats, monkeys, and humans. , 1992, Psychological review.

[30]  J. Kruschke,et al.  ALCOVE: an exemplar-based connectionist model of category learning. , 1992, Psychological review.

[31]  J. Shonkoff,et al.  Development of infants with disabilities and their families: implications for theory and service delivery. , 1992, Monographs of the Society for Research in Child Development.

[32]  S Pinker,et al.  Overregularization in language acquisition. , 1992, Monographs of the Society for Research in Child Development.

[33]  James L. McClelland,et al.  Can a perceptual processing deficit explain the impairment of inflectional morphology in developmental dysphasia? A computational investigation. , 1993 .

[34]  Peter M. Todd,et al.  Learning and connectionist representations , 1993 .

[35]  James L. McClelland Toward a theory of information processing in graded, random, and interactive networks , 1993 .

[36]  Paul W. B. Atkins,et al.  Models of reading aloud: Dual-route and parallel-distributed-processing approaches. , 1993 .

[37]  V. Marchman,et al.  From rote learning to system building: acquiring verb morphology in children and connectionist nets , 1993, Cognition.

[38]  James L. McClelland,et al.  Why there are complementary learning systems in the hippocampus and neocortex: insights from the successes and failures of connectionist models of learning and memory. , 1995, Psychological review.

[39]  J. Elman,et al.  Learning and morphological change , 1995, Cognition.

[40]  Gary F. Marcus,et al.  German Inflection: The Exception That Proves the Rule , 1995, Cognitive Psychology.

[41]  James L. McClelland,et al.  Understanding normal and impaired word reading: computational principles in quasi-regular domains. , 1996, Psychological review.

[42]  James L. McClelland,et al.  Considerations arising from a complementary learning systems perspective on hippocampus and neocortex , 1996, Hippocampus.

[43]  Steven Pinker,et al.  Words and rules , 1998 .

[44]  J. Zwart The Minimalist Program , 1998, Journal of Linguistics.

[45]  Don H. Johnson,et al.  Toward a theory of information processing , 2000, 2000 IEEE International Symposium on Information Theory (Cat. No.00CH37060).

[46]  David C. Plaut,et al.  Are non-semantic morphological effects incompatible with a distributed connectionist approach to lexical processing? , 2000 .

[47]  Janet B. Pierrehumbert,et al.  Exemplar dynamics: Word frequency, lenition and contrast , 2000 .

[48]  Bobby D. Bryant and Risto Miikkulainen From Word Stream To Gestalt: A Direct Semantic Parse For Complex Sentences , 2001 .

[49]  David C. Plaut,et al.  A connectionist model of sentence comprehension and production , 2002 .

[50]  James L. McClelland,et al.  ‘Words or Rules’ cannot exploit the regularity in exceptions , 2002, Trends in Cognitive Sciences.

[51]  Luigi Burzio Missing players: Phonology and the past-tense debate , 2002 .

[52]  James L. McClelland,et al.  Rules or connections in past-tense inflections: what does the evidence rule out? , 2002, Trends in Cognitive Sciences.

[53]  Anne R. Schutte,et al.  Testing the dynamic field theory: working memory for locations becomes more spatially precise over development. , 2003, Child development.

[54]  F. Newmeyer On Nature and Language, and: The Language Organ: Linguistics as Cognitive Physiology, and: Language in a Darwinian Perspective (review) , 2003 .

[55]  Gary Lupyan,et al.  Did, Made, Had, Said: Capturing Quasi-Regularity in Exception , 2003 .

[56]  James L. McClelland,et al.  Structure and deterioration of semantic memory: a neuropsychological and computational investigation. , 2004, Psychological review.

[57]  James L. McClelland,et al.  Semantic Cognition: A Parallel Distributed Processing Approach , 2004 .

[58]  Grover Hudson,et al.  PHONOLOGY AND LANGUAGE USE , 2004 .

[59]  Morten H. Christiansen,et al.  Uncovering the Richness of the Stimulus: Structure Dependence and Indirect Statistical Evidence , 2005, Cogn. Sci..

[60]  J. Elman Distributed representations, simple recurrent networks, and grammatical structure , 1991, Machine Learning.

[61]  James L. McClelland,et al.  Alternatives to the combinatorial paradigm of linguistic theory based on domain general principles of human cognition , 2005 .

[62]  Joan L. Bybee,et al.  From Usage to Grammar: The Mind's Response to Repetition , 2007 .

[63]  Elizabeth Jefferies,et al.  Presemantic Cognition in Semantic Dementia: Six Deficits in Search of an Explanation , 2006, Journal of Cognitive Neuroscience.

[64]  G. Dell,et al.  Becoming syntactic. , 2006, Psychological review.

[65]  James L. McClelland,et al.  How Language Affects Thought in a Connectionist Model , 2007 .

[66]  Mark S. Seidenberg,et al.  Graded semantic and phonological similarity effects in priming: evidence for a distributed connectionist approach to morphology. , 2007, Journal of experimental psychology. General.

[67]  R. Jackendoff Linguistics in Cognitive Science: The state of the art , 2007 .

[68]  Joan L. Bybee,et al.  Gradience of Gradience: A reply to Jackendoff , 2007 .

[69]  James L. McClelland,et al.  A single-system account of semantic and lexical deficits in five semantic dementia patients , 2008, Cognitive neuropsychology.

[70]  James L. McClelland,et al.  Toward a Unified Theory of Development: Connectionism and Dynamic Systems Theory Re-Considered , 2009 .

[71]  James L. McClelland,et al.  A connectionist model of a continuous developmental transition in the balance scale task , 2009, Cognition.

[72]  James L. McClelland,et al.  Semantic Cognition: Its Nature, Its Development, and Its Neural Basis , 2008 .

[73]  James L. McClelland The Place of Modeling in Cognitive Science , 2009, Top. Cogn. Sci..

[74]  James L. McClelland,et al.  Connectionist Models of Development: Mechanistic Dynamical Models with Emergent Dynamical Properties , 2009 .

[75]  James L. McClelland,et al.  Dynamical and connectionist approaches to development: toward a future of mutually beneficial co-evolution , 2009 .

[76]  J. Tenenbaum,et al.  Structured statistical models of inductive reasoning. , 2009, Psychological review.

[77]  D. Plaut,et al.  Locating Object Knowledge in the Brain , 2022 .

[78]  Geoffrey E. Hinton,et al.  Deep Belief Networks for phone recognition , 2009 .

[79]  J. Tenenbaum,et al.  Probabilistic models of cognition: exploring representations and inductive biases , 2010, Trends in Cognitive Sciences.

[80]  James L. McClelland,et al.  Locating object knowledge in the brain: comment on Bowers's (2009) attempt to revive the grandmother cell hypothesis. , 2010, Psychological review.

[81]  J. Tenenbaum,et al.  The learnability of abstract syntactic principles , 2011, Cognition.

[82]  Noam Chomsky,et al.  Poverty of the Stimulus Revisited , 2011, Cogn. Sci..

[83]  James L. McClelland,et al.  Generalization Through the Recurrent Interaction of Episodic Memories , 2012, Psychological review.

[84]  James L. McClelland,et al.  Learning hierarchical category structure in deep neural networks , 2013 .

[85]  M. Tanenhaus Afterword The impact of “The cognitive basis for linguistic structures” , 2013 .

[86]  Christopher Potts,et al.  Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank , 2013, EMNLP.

[87]  Marc'Aurelio Ranzato,et al.  Building high-level features using large scale unsupervised learning , 2011, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.

[88]  Andrew Y. Ng,et al.  Parsing with Compositional Vector Grammars , 2013, ACL.

[89]  James L. McClelland Incorporating rapid neocortical learning of new schema-consistent information into complementary learning systems theory. , 2013, Journal of experimental psychology. General.

[90]  David C. Plaut,et al.  Quasiregularity and Its Discontents: The Legacy of the Past Tense Debate , 2014, Cogn. Sci..

[91]  Franziska Frankfurter,et al.  Constructions: A construction grammar approach to argument structure: Adele E. Goldberg, Chicago, IL: The University of Chicago Press, 1995. xi + 265 pp , 1998 .

[92]  M. Harm Building Large Scale Distributed Semantic Feature Sets with WordNet , 2002 .