The Role of Input Size and Generativity in Simulating Language Acquisition

This paper analyses the role of input size and generativity (the ability to produce novel utterances) in simulating developmental data on a phenomenon in first language acquisition. An existing model that has already simulated the basic phenomenon is trained on input sets of varying sizes (13,000 to 40,000 utterances), and its ability to produce novel utterances is also manipulated. Both input size and generativity affect the fits for later stages of development. Higher generativity improves fits for later stages but worsens them for early stages, suggesting that generativity is best increased as a function of mean length of utterance (MLU). The effect of training set size is variable. Results are discussed in terms of optimal training sets for simulations and children’s developing ability to produce utterances beyond the input they have heard.
