Learning Structured Generative Concepts

Andreas Stuhlmüller, Joshua B. Tenenbaum, Noah D. Goodman
Brain and Cognitive Sciences, MIT
{ast, jbt, ndg}@mit.edu

Abstract

Many real-world concepts, such as “car”, “house”, and “tree”, are more than simply a collection of features. These objects are richly structured, defined in terms of systems of relations, subparts, and recursive embeddings. We describe an approach to concept representation and learning that attempts to capture such structured objects. This approach builds on recent probabilistic approaches, viewing concepts as generative processes, and on recent rule-based approaches, constructing concepts inductively from a language of thought. Concepts are modeled as probabilistic programs that describe generative processes; these programs are described in a compositional language. In an exploratory concept learning experiment, we investigate human learning from sets of tree-like objects generated by processes that vary in their abstract structure, from simple prototypes to complex recursions. We compare human categorization judgements to predictions of the true generative process as well as a variety of exemplar-based heuristics.

Introduction

Concept learning has traditionally been studied in the context of relatively unstructured objects that can be described as collections of features. Learning and categorization can be understood formally as problems of statistical inference, and a number of successful accounts of concept learning can be viewed in terms of probabilistic models defined over different ways to represent structure in feature sets, such as prototypes, exemplars, or logical rules (Anderson, 1990; Shi, Feldman, & Griffiths, 2008; Goodman, Tenenbaum, Feldman, & Griffiths, 2008). Yet for many real-world object concepts, such as “car”, “house”, “tree”, or “human body”, instances are more than simply a collection of features. These objects are richly structured, defined in terms of features connected in systems of relations, parts and subparts at multiple scales of abstraction, and even recursive embedding (Markman, 1999). A tree has branches coming out of a trunk, with roots in the ground; branches give rise to smaller branches, and there are leaves at the end of the branches. A human body has a head on top of a torso; arms and legs come out of the torso, with arms ending in hands, made of fingers. A house is composed of walls, roofs, doors, and other parts arranged in characteristic functional and spatial relations that are harder to verbalize but still easy to recognize and reason about. Besides objects, examples of structured concepts can be found in language (e.g. the mutually recursive system of phrase types in a grammar), in the representation of events (e.g. a soccer match with its fixed subparts), and in processes (e.g. the recipe for making a pancake with steps at different levels of abstraction).

Such concepts have not been the focus of research in the probabilistic modeling tradition. Here we describe an approach to representing structured concepts—more typical of the complexity of real-world categories—using probabilistic generative processes. We test whether statistical inference with these generative processes can account for how people categorize novel instances of structured concepts and compare with more heuristic, exemplar-based approaches.
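As a concrete illustration of this idea, the following is a minimal sketch of a generative process for colored trees: a recursive procedure whose random choices (node color, branching, stopping) define a distribution over tree-shaped objects. It is written in Python rather than in the compositional representation language discussed below, and the function name sample_tree, the particular colors, and the probabilities are illustrative assumptions only.

```python
import random

def sample_tree(depth=0, max_depth=3):
    # A probabilistic choice: pick this node's color at random.
    color = random.choice(["brown", "green", "red", "blue"])
    # Stop recursing with some probability (or at the depth limit),
    # so every run terminates and the program defines a distribution.
    if depth >= max_depth or random.random() < 0.4:
        return {"color": color, "children": []}
    # Otherwise recurse: the number of children is itself a random
    # choice, letting the program express parts, subparts, and recursion.
    n_children = random.randint(1, 3)
    return {"color": color,
            "children": [sample_tree(depth + 1, max_depth)
                         for _ in range(n_children)]}

# Each call draws one example object from the concept.
example = sample_tree()
```

Repeated calls to sample_tree yield samples from the distribution the program denotes; categorizing a novel object can then be framed as asking how probable that object is under the program.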
Because a structured concept like “house” has no single, simple perceptual prototype that is similar to all examples, learning such a concept might seem very difficult. However, each example of a structured concept itself has internal structure which makes it potentially very informative. Consider Figure 1, where from only a few observations of a concept it is easy to see the underlying structural regularity that can be extended to new items. The regularities underlying structured concepts can often be expressed with instructions for generating the examples: “Draw a sequence of brown dots, choose a branch color, and for each brown dot draw two dots of this color branching from it.”

Figure 1: Three examples of a structured concept described by a simple generative process.

We build on the work of Goodman, Tenenbaum, et al. (2008), who introduced an approach to concept learning as Bayesian inference over a grammatically structured hypothesis space—a “language of thought.” Single concepts expressed in this language were simple propositional rules for classifying objects, but this approach naturally extends to richer representations, providing a concept learning theory for any representation language. Here we consider a language for generative processes based on probabilistic programs: instructions for constructing objects, which may include probabilistic choices, thus describing distributions on objects—in our case distributions on colored trees. Because this language describes generative processes as programs, it captures regularities as abstract as subparts and recursion.

The theory of concept representation that we describe here shares many aspects with previous approaches to concepts. Like prototype and mixture models (Anderson, 1990; Griffiths, Canini, & Sanborn, 2007), probabilistic programs describe distributions on observations. However, prototypes and mixtures generate observations as noisy copies of ideal prototypes for the concept and thus cannot capture more abstract structures such as recursion. Like rule-based models of concept learning, our approach supports compositionality: complex concepts are composed out of simple ones—but rather