The acquisition of inductive constraints

Human learners routinely make inductive inferences, or inferences that go beyond the data they have observed. Inferences like these must be supported by constraints, some of which are innate, although others are almost certainly learned. This thesis presents a hierarchical Bayesian framework that helps to explain the nature, use and acquisition of inductive constraints. Hierarchical Bayesian models include multiple levels of abstraction, and the representations at the upper levels place constraints on the representations at the lower levels. The probabilistic nature of these models allows them to make statistical inferences at multiple levels of abstraction. In particular, they show how knowledge can be acquired at levels quite remote from the data of experience, levels where the representations learned are naturally described as inductive constraints. Hierarchical Bayesian models can address inductive problems from many domains, but this thesis focuses on models that address three aspects of high-level cognition. The first model is sensitive to patterns of feature variability, and acquires constraints similar to the shape bias in word learning. The second model acquires causal schemata, which are systems of abstract causal knowledge that allow learners to discover causal relationships given very sparse data. The final model discovers the structural form of a domain: for instance, it discovers whether the relationships between a set of entities are best described by a tree, a chain, a ring, or some other kind of representation. The hierarchical Bayesian approach captures several principles that go beyond traditional formulations of learning theory. It supports learning at multiple levels of abstraction, it handles structured representations, and it helps to explain how learning can succeed given sparse and noisy data. Principles like these are needed to explain how humans acquire rich systems of knowledge, and hierarchical Bayesian models point the way towards a modern learning theory that is better able to capture the sophistication of human learning.

(Copies available exclusively from MIT Libraries, Rm. 14-0551, Cambridge, MA 02139-4307. Ph. 617-253-5668; Fax 617-253-1690.)
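
As a rough illustration of how knowledge at an upper level can be learned from data and then constrain inferences at a lower level, the sketch below uses a two-level beta-binomial model over bags of marbles. It is not the model developed in the thesis: the marble scenario, the grid of hyperparameter values, the uniform prior over that grid, and every function and variable name are assumptions made for this example. Each bag has its own proportion of black marbles, drawn from a shared Beta distribution whose parameters (`alpha` for how mixed bags tend to be, `beta` for the overall proportion of black) are inferred from previously seen bags.

```python
import math

# Hypothetical grid over the upper-level parameters (an assumption of this sketch).
ALPHAS = [0.1, 0.3, 1.0, 3.0, 10.0, 30.0]        # small alpha: bags tend to be uniform
BETAS = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]  # overall share of black marbles


def log_beta(a, b):
    """Log of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def log_marginal(k, n, a, b):
    """Log probability of one particular sequence with k black marbles out of n,
    with the bag's proportion integrated out under a Beta(a, b) prior."""
    return log_beta(a + k, b + n - k) - log_beta(a, b)


def predict_new_bag(bags, new_k, new_n, alphas=ALPHAS, betas=BETAS):
    """Probability that the next marble from a new bag is black, given
    (black, total) counts for earlier bags and new_k of new_n black so far."""
    log_weights = []
    predictions = []
    for alpha in alphas:
        for beta in betas:
            a, b = alpha * beta, alpha * (1.0 - beta)
            # Evidence for this (alpha, beta) pair from all observed draws.
            lw = sum(log_marginal(k, n, a, b) for k, n in bags)
            lw += log_marginal(new_k, new_n, a, b)
            log_weights.append(lw)
            # Lower-level prediction for the new bag under this pair.
            predictions.append((a + new_k) / (a + b + new_n))
    # Normalize in a numerically stable way (uniform prior over the grid).
    top = max(log_weights)
    weights = [math.exp(lw - top) for lw in log_weights]
    total = sum(weights)
    return sum(w * p for w, p in zip(weights, predictions)) / total


# Earlier bags that are each uniform in color versus bags that are each half black.
uniform_bags = [(20, 20), (0, 20), (20, 20), (0, 20)]
mixed_bags = [(10, 20), (10, 20), (10, 20), (10, 20)]

# In both cases a single black marble has been drawn from a brand-new bag.
p_uniform = predict_new_bag(uniform_bags, 1, 1)
p_mixed = predict_new_bag(mixed_bags, 1, 1)
print(p_uniform, p_mixed)
```

After seeing uniform bags, one black marble from a new bag makes a second black marble very likely; after seeing mixed bags, the same single observation moves the prediction only a little away from one half. The learned setting of `alpha` plays the role of an acquired inductive constraint in this toy setting.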
