In traditional theories of semantic memory, performance of semantic tasks relies upon a mediating process of categorization. However, categorization-based theories do not capture the complex and flexible ways in which people use their conceptual knowledge to perform the natural semantic tasks imposed on them by the environment. For example, both children and adults understand that a given property may be important for categorizing some kinds of objects but not others; that different kinds of properties generalize across different groups of objects; and that insides can be more important than outsides for determining category membership. Consequently, some researchers propose to describe conceptual knowledge in terms of naive theories about causal mechanisms. In the current work, we present simulations using a simple connectionist network that learns the mappings between objects and their properties in different contexts. We show that the evolution of representations throughout learning in our model constrains the ease with which particular object properties can be learned, and how they will generalize. The configuration of weights at any point during development may provide the kinds of ‘enabling constraints’ on acquisition that some researchers attribute to naive theories. Many of the phenomena that arise in the theory-theory tradition may be understood within this framework: knowledge about how object properties vary across contexts is stored in connection weights that are learned from experience, and this knowledge plays the role that naive theories play in the theory-theory framework.

Introduction

Theories of conceptual knowledge that emphasize the role of learning and experience in acquisition have come under fire in recent years. Such theories are generally thought to be too underconstrained to adequately explain conceptual development without additional explanatory constructs, such as implicit theories. Among the phenomena that would seem to support this view are the following:

Illusory correlations: Children and adults may create or enhance some object-property correlations, while ignoring others.
Feature centrality: A given feature may be important for some categories of objects, but not others.
Flexible generalization: Children and adults can generalize their knowledge in ways that challenge simple similarity-based mechanisms.
Expertise: Different kinds of experts may acquire different representations of objects in the same domain.

We have been investigating the capacity of the parallel distributed processing (PDP) framework to provide a general theory of semantic memory. Our approach builds upon earlier work by Hinton (1981) and Rumelhart (Rumelhart, Smolensky, McClelland, & Hinton, 1986; Rumelhart & Todd, 1993). Under the PDP theory, semantic memory is encoded in the weights of a connectionist network, which must learn the mappings between objects and their properties in different contexts. Domain-general learning mechanisms sensitive to the structure of the environment lead the system to gradually acquire correct mappings and, in so doing, to discover abstract, distributed representations of objects that capture their deep similarity relations in the context of a particular task. Thus, under this view, the development of conceptual knowledge is largely driven by experience. Learned similarities among the system’s internal representations provide a mechanism for knowledge generalization and induction.
However, because knowledge about a given object and a particular task both provide graded constraints on the system’s internal states, different kinds of knowledge may generalize across different groups of objects. We believe this framework provides a powerful set of tools for understanding human performance in semantic tasks. However, the PDP theory clearly relies to a great extent on mechanisms of learning to explain conceptual development. How might it explain the empirical observations that seem to undermine learning-based theories?

A simple implementation of the theory, adapted from Rumelhart and Todd (1993), is shown in Figure 1. Input units appear on the left, and activation propagates from left to right. Where connections are indicated, every unit in the pool on the left is connected to every unit in the pool to the right. Each unit in the Item layer corresponds to an individual object in the environment. Each unit in the Context layer represents contextual constraints on the kind of information to be retrieved. Thus, the input pair canary can corresponds to a situation in which the network is shown a picture of a canary and asked what it can do. The network is trained to turn on all those units that represent correct completions of the input query; in the example shown, the correct units to activate are grow, move, fly, and sing. To find a set of weights that allow the model to perform correctly, it is trained with backpropagation. As small changes to the weights accumulate, the network gradually acquires distributed internal representations of the various items that capture their semantic relations.

Figure 1. A simple feed-forward implementation of the theory, based on the model proposed by Rumelhart and Todd (1993). (Item and Context input layers feed through Representation and Item-in-Context hidden layers to Attribute output units; the contexts are ISA, is, can, and has.)

The first layer of weights maps each individual input unit to a distributed pattern of activity across the units in the layer labeled Representation. Initially, all the weights in the network are small and random, and the patterns of activity corresponding to the various items are all similar. As the network’s weights change to improve its performance, these internal representations gradually differentiate. Figure 2 shows a multidimensional scaling of the network’s internal representations of all 21 items at ten different points during training. The proximity of points in the diagram indicates the degree to which their internal representations are similar. Each line corresponds to a single item, and traces the trajectory of that item’s representation throughout learning. The figure shows that initially, all representations are similar to one another. The model first differentiates items into global categories (plants and animals), and only later differentiates finer-grained categories. To the extent that two items have similar representations, the network is pressured to generalize its knowledge from one to the other.
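To make the mechanics of this kind of implementation concrete, the sketch below builds a toy item/context network of the Rumelhart-and-Todd style in Python with NumPy. The layer sizes, learning rate, number of training sweeps, and the small item/context/attribute corpus are illustrative assumptions rather than the values used in our simulations, and a simple pairwise-distance readout of the Representation layer stands in for the multidimensional scaling shown in Figure 2.

```python
# A minimal sketch of a Rumelhart-and-Todd-style item/context network.
# Assumptions (not taken from the paper): layer sizes, learning rate,
# number of training sweeps, and the small toy corpus below.
import numpy as np

rng = np.random.default_rng(0)

items = ["canary", "robin", "salmon", "oak", "rose"]   # illustrative subset
contexts = ["ISA", "is", "can", "has"]
attributes = ["living", "pretty", "yellow", "grow", "move",
              "fly", "sing", "swim", "wings", "scales", "leaves", "petals"]

# (item, context) -> correct attribute completions for the toy corpus.
corpus = [
    ("canary", "is",  ["living", "pretty", "yellow"]),
    ("canary", "can", ["grow", "move", "fly", "sing"]),
    ("canary", "has", ["wings"]),
    ("robin",  "can", ["grow", "move", "fly"]),
    ("robin",  "has", ["wings"]),
    ("salmon", "can", ["grow", "move", "swim"]),
    ("salmon", "has", ["scales"]),
    ("oak",    "can", ["grow"]),
    ("oak",    "has", ["leaves"]),
    ("rose",   "is",  ["living", "pretty"]),
    ("rose",   "has", ["petals", "leaves"]),
]

def one_hot(name, vocab):
    v = np.zeros(len(vocab))
    v[vocab.index(name)] = 1.0
    return v

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Item -> Representation -> (joined by Context) -> hidden -> Attributes.
n_rep, n_hid, lr = 8, 15, 0.1
W_ir = rng.normal(0.0, 0.1, (len(items), n_rep))       # Item to Representation
W_rh = rng.normal(0.0, 0.1, (n_rep, n_hid))            # Representation to hidden
W_ch = rng.normal(0.0, 0.1, (len(contexts), n_hid))    # Context to hidden
W_ha = rng.normal(0.0, 0.1, (n_hid, len(attributes)))  # Hidden to Attributes

def forward(item_vec, ctx_vec):
    rep = sigmoid(item_vec @ W_ir)
    hid = sigmoid(rep @ W_rh + ctx_vec @ W_ch)
    out = sigmoid(hid @ W_ha)
    return rep, hid, out

for sweep in range(5000):
    for item, ctx, correct in corpus:
        x_i, x_c = one_hot(item, items), one_hot(ctx, contexts)
        target = np.zeros(len(attributes))
        for a in correct:
            target[attributes.index(a)] = 1.0

        rep, hid, out = forward(x_i, x_c)

        # Backpropagate the squared error through the sigmoid layers.
        d_out = (out - target) * out * (1.0 - out)
        d_hid = (d_out @ W_ha.T) * hid * (1.0 - hid)
        d_rep = (d_hid @ W_rh.T) * rep * (1.0 - rep)

        W_ha -= lr * np.outer(hid, d_out)
        W_ch -= lr * np.outer(x_c, d_hid)
        W_rh -= lr * np.outer(rep, d_hid)
        W_ir -= lr * np.outer(x_i, d_rep)

# Compare learned Representation-layer patterns: items with overlapping
# attribute structure should end up closer together than unrelated items.
reps = {it: sigmoid(one_hot(it, items) @ W_ir) for it in items}
for i, a in enumerate(items):
    for b in items[i + 1:]:
        print(f"{a:>6} vs {b:<6}: {np.linalg.norm(reps[a] - reps[b]):.2f}")
```

After training, the printed distances should show canary and robin lying closer to each other than to salmon, oak, or rose, a small-scale analogue of the progressive differentiation visualized in Figure 2.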
References

Medin, D., et al. (1997). Categorization and reasoning among tree experts: Do all roads lead to Rome? Cognitive Psychology.
Todd, P. M., et al. (1993). Learning and connectionist representations.
Bertenthal, B., et al. (1993). The Epigenesis of Mind: Essays on Biology and Cognition.
Gelman, R., et al. (1988). Preschooler's ability to decide whether a photographed unfamiliar object can move itself.
Markman, E., et al. (1986). Categories and induction in young children. Cognition.
Hinton, G. E., et al. (1986). Schemata and sequential thought processes in PDP models.
Macario, J. F., et al. (1991). Young children's use of color in classification: Foods and canonically colored objects.