Learnability, representation, and language: a Bayesian approach

Within the metaphor of the “mind as a computation device” that dominates cognitive science, understanding human cognition means understanding learnability: not only what (and how) the brain learns, but also what data is available to it from the world. Ideal learnability arguments seek to characterize what knowledge an ideal reasoner could in theory acquire, which illuminates the path towards understanding what human reasoners actually do acquire. The goal of this thesis is to exploit recent advances in machine learning to revisit three common learnability arguments in language acquisition. By formalizing them in Bayesian terms and evaluating them on realistic, real-world datasets, I gain insight into what must be assumed about a child's representational capacity, learning mechanism, and cognitive biases. Exploring learnability in the context of an ideal learner but realistic (rather than ideal) datasets makes it possible to investigate what could be learned in practice rather than noting what is impossible in theory. Understanding how higher-order inductive constraints can themselves be learned permits a reconsideration of inferences about innate inductive constraints. And realizing how a learner who evaluates theories based on a simplicity/goodness-of-fit tradeoff can handle sparse evidence may lead to a new perspective on how humans reason from the noisy and impoverished data in the world. The learnability arguments I consider all ultimately stem from the impoverishment of the input: it lacks negative evidence, it lacks a certain essential kind of positive evidence, or it lacks the quantity of evidence necessary for choosing from an infinite set of possible generalizations.
I focus on these learnability arguments in the context of three major topics in language acquisition: the acquisition of abstract linguistic knowledge about hierarchical phrase structure, the acquisition of verb argument structures, and the acquisition of word-learning biases. (Copies available exclusively from MIT Libraries, Rm. 14-0551, Cambridge, MA 02139-4307. Ph. 617-253-5668; Fax 617-253-1690.)
