Modeling infant learning via symbolic structural alignment

Sven E. Kuehne (skuehne@ils.nwu.edu)
Department of Computer Science, Northwestern University
1890 Maple Avenue, Evanston, IL 60201 USA

Dedre Gentner (gentner@nwu.edu)
Department of Psychology, Northwestern University
2029 Sheridan Rd., Evanston, IL 60201 USA

Kenneth D. Forbus (forbus@ils.nwu.edu)
Department of Computer Science, Northwestern University
1890 Maple Avenue, Evanston, IL 60201 USA

Abstract

Understanding the mechanisms of learning is one of the central questions of Cognitive Science. Recently Marcus et al. showed that seven-month-old infants can learn to recognize regularities in simple language-like stimuli. Marcus proposed that these results could not be modeled via existing connectionist systems, and that such learning requires infants to construct rules containing algebraic variables. This paper proposes a third possibility: that such learning can be explained via structural alignment processes operating over structured representations. We demonstrate the plausibility of this approach by describing a simulation, built out of previously tested models of symbolic similarity processing, that models the Marcus data. Unlike existing connectionist simulations, our model learns within the span of stimuli presented to the infants and does not require supervision. It can handle input with and without noise. Contrary to Marcus' proposal, our model does not require the introduction of variables. It incrementally abstracts structural regularities, which do not need to be fully abstract rules for the phenomenon to appear. Our model also proposes a processing explanation for why infants attend longer to the novel stimuli. We describe our model and the simulation results and discuss the role of structural alignment in the development of abstract patterns and rules.

Introduction

Understanding the mechanisms of learning is one of the central questions of cognitive science. Recent studies (Gomez & Gerken, 1999; Marcus, Vijayan, Rao & Vishton, 1999) have shown that infants as young as seven months can process simple language-like stimuli and build generalizations sufficient to distinguish familiar from unfamiliar patterns in novel test stimuli. In Marcus et al.'s study, the stimuli were simple 'sentences,' each consisting of three nonsense consonant-vowel 'words' (e.g., 'ba', 'go', 'ka'). All habituation stimuli followed the same grammar, either ABA or ABB. In ABA-type stimuli the first and the third word are the same, e.g., 'pa-ti-pa'; in ABB-type stimuli the second and the third word are identical, e.g., 'le-di-di'. The infants were habituated on 16 such sentences, with three repetitions of each sentence. The infants were then tested on a different set of sentences that consisted of entirely new words. Half of the test stimuli followed the same grammar as in the habituation phase; the other half followed the non-trained grammar (this design is sketched below). Marcus et al. found that the infants dishabituated significantly more often to sentences in the non-trained pattern than to sentences in the trained pattern.

Based on these findings Marcus et al. proposed that infants had learned abstract algebraic rules. They noted that these results cannot be accounted for solely by statistical mechanisms that track transitional probabilities.
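To make the task structure concrete, the following Python sketch generates three-word sentences in the ABA and ABB grammars and classifies test sentences by which positions repeat. It is purely illustrative and not part of the study materials: the syllable inventories, the test-set size, and the helper names (make_sentence, grammar_of) are assumptions; only the ABA/ABB patterns, the 16-sentence habituation set with three repetitions, and the use of entirely new words at test come from the description above.

```python
# Illustrative sketch of the Marcus et al. (1999) task structure.
# Syllable lists and test-set size below are placeholders, not the
# actual experimental materials.
import random

HABITUATION_SYLLABLES = ['ba', 'go', 'ka', 'pa', 'ti', 'le', 'di']  # example words from the text
TEST_SYLLABLES = ['wo', 'fe', 'de', 'ko']                           # hypothetical novel words

def make_sentence(grammar, syllables):
    """Build one three-word 'sentence' following an ABA or ABB grammar."""
    a, b = random.sample(syllables, 2)
    return (a, b, a) if grammar == 'ABA' else (a, b, b)

def grammar_of(sentence):
    """Classify a sentence by which positions contain the same word."""
    first, second, third = sentence
    if first == third:
        return 'ABA'
    if second == third:
        return 'ABB'
    return 'other'

# Habituation: 16 sentences in one grammar, each repeated three times.
trained = 'ABA'
habituation = [make_sentence(trained, HABITUATION_SYLLABLES) for _ in range(16)] * 3

# Test: sentences built from entirely new words, half in the trained
# grammar and half in the other (item count here is arbitrary).
test = [make_sentence(g, TEST_SYLLABLES) for g in ('ABA', 'ABB') * 2]
for sentence in test:
    consistent = grammar_of(sentence) == trained
    print(sentence, '-> trained pattern' if consistent else '-> novel pattern')
```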
They further argue that their results challenge connectionist models of human learning that use similar information, on two grounds: (1) the infants learn in many fewer trials than are typically needed by connectionist learning systems; (2) more importantly, the infants learn without feedback. In particular, Marcus et al. demonstrated that a simple recurrent network given the same input stimuli could not model this learning task.

In response, several connectionist models have attempted to simulate these findings. Unfortunately, all of them to date include extra assumptions that make them a relatively poor fit for the Marcus et al. experiment. For example, Elman (1999; Seidenberg & Elman, 1999) use massive pre-training (50,000 trials) to teach the network the individual stimuli. More importantly, they turn the infants' unsupervised learning task into a supervised learning task by providing the network with external training signals. Other models tailored to capture the data of the study seem unlikely to be applicable to other similar cognitive tasks (Altmann & Dienes, 1999). Using a localist temporal binding scheme, Shastri and Chang (1999) model the infant results without pretraining and without supervision, but still require an order of magnitude more exposure to the stimuli than the infants received.

We propose a third alternative. There is evidence that structural alignment processes operating over symbolic structured representations participate in a number of cognitive processes, including analogy and similarity (Gentner, 1983), categorization (Markman & Gentner, 1993), detection of symmetry and regularity (Ferguson, 1994), and learn-