Modeling disambiguation in word learning via multiple probabilistic constraints

Molly Lewis (mll@stanford.edu) and Michael C. Frank (mcfrank@stanford.edu)
Department of Psychology, Stanford University

Abstract

Young children tend to map novel words to novel objects even in the presence of familiar competitors, a finding that has been dubbed the “disambiguation” effect. Theoretical accounts of this effect have debated whether it is due to initial constraints on children’s lexicons (e.g. a principle of mutual exclusivity) or situation-specific pragmatic inferences. We suggest that both could be true. We present a hierarchical Bayesian model that implements both situation-level and hierarchical inference, and show that both can in principle contribute to disambiguation inferences with different levels of strength depending on differences in the situation and language experience of the learner. We additionally present data testing a novel prediction of this probabilistic view of disambiguation.

Keywords: Word learning; mutual exclusivity; Bayesian models.

Introduction

A central property of language is that each word in the lexicon maps to a unique concept, and each concept maps to a unique word (Clark, 1987). Like other important regularities in language (e.g. grammatical categories), children cannot directly observe this general property. Instead, they must learn to use language in a way that is consistent with this generalization on the basis of evidence about only specific word-object pairs.

Even very young children behave in a way that is consistent with the one-to-one mapping between words and concepts in language. Evidence for this claim comes from what is known as the “disambiguation” effect. In a typical demonstration of this effect (e.g. Markman & Wachtel, 1988), children are presented with a novel object and a familiar object (e.g. a whisk and a ball), and are asked to identify the referent of a novel word (“show me the dax”). Children in this task tend to choose the novel object as the referent, behaving in a way that is consistent with the one-to-one word-concept regularity in language, across a wide range of ages and experimental paradigms (Mervis, Golinkoff, & Bertrand, 1994; Golinkoff, Mervis, Hirsh-Pasek, et al., 1994; Markman, Wasow, & Hansen, 2003; Halberda, 2003; Bion, Borovsky, & Fernald, 2013).

This effect has received much attention in the word learning literature because the ability to identify the meaning of a word in ambiguous contexts is, in essence, the core problem of word learning. That is, given any referential context, the meaning of a word is underdetermined (Quine, 1960), and the challenge for the word learner is to identify the referent of the word within this ambiguous context. Critically, the ability to infer that a novel word maps to a novel object makes the problem much easier to solve. For example, suppose a child hears the novel word “kumquat” while in the produce aisle of the grocery store. There are an infinite number of possible meanings of this word given this referential context, but the child’s ability to correctly disambiguate would lead her to rule out all meanings for which she already had a name. With this restricted hypothesis space, the child is more likely to identify the correct referent than if all objects in the context were considered as possible referents.

What are the cognitive processes underlying this effect? There are broadly two proposals in the literature. Under one proposal, Markman and colleagues (1988; 2003) suggest that children have a constraint on the types of lexicons considered when learning the meaning of a new word: a “mutual exclusivity constraint.” With this constraint, children are biased to consider only those lexicons that have a one-to-one mapping between words and objects. Importantly, this constraint can be overcome in cases where it is incorrect (e.g. adjectives), but it nonetheless serves to restrict the set of lexicons initially entertained when learning the meaning of a novel word. Under this view, then, the disambiguation effect emerges from a constraint on the structure of lexicons.

Under a second proposal, the disambiguation effect is argued to result from online inferences made within the referential context (Clark, 1987; Diesendruck & Markson, 2001). Clark suggests that the disambiguation effect is due to two pragmatic assumptions held by speakers. The first assumption is that speakers within the same speech community use the same words to refer to the same objects (“Principle of Conventionality”). The second assumption is that different linguistic forms refer to different meanings (“Principle of Contrast”). In the disambiguation task described above, then, children might reason (implicitly) as follows: You used a word I’ve never heard before. Since, presumably, we both call a ball “ball,” and if you’d meant the ball you would have said “ball,” this new word must refer to the new object. Thus, under this account, disambiguation emerges not from a higher-order constraint on the structure of lexicons, but instead from in-the-moment inferences using general pragmatic principles.

These two proposals have traditionally been viewed as competing explanations of the disambiguation effect. Research in this area has consequently focused on identifying empirical tests that can distinguish between these two theories. For example, Diesendruck and Markson (2001) compared performance on a disambiguation task when children were told a novel fact about an object with performance when they were told a novel referential label. They found that children disambiguated in both conditions and argued on grounds of parsimony that the same pragmatic mechanism was likely to be responsible for both inferences. More recent evidence contradicts this

[1]  Michael C. Frank,et al.  Predicting Pragmatic Reasoning in Language Games , 2012, Science.

[2]  A. Fernald,et al.  Fast mapping, slow learning: Disambiguation of novel word–object mappings in relation to vocabulary learning at 18, 24, and 30 months , 2013, Cognition.

[3]  Thomas L. Griffiths,et al.  A Rational Analysis of Rule-Based Concept Learning , 2008, Cogn. Sci..

[4]  Eve V. Clark,et al.  The principle of contrast: A constraint on language acquisition. , 1987 .

[5]  J. Werker,et al.  Monolingual, bilingual, trilingual: infants' language experience influences the development of a word-learning heuristic. , 2009, Developmental science.

[6]  E. Markman,et al.  Use of the mutual exclusivity assumption by young word learners , 2003, Cognitive Psychology.

[7]  C. Mervis,et al.  Two-year-olds readily learn multiple labels for the same basic-level category. , 1994, Child development.

[8]  J. Halberda,et al.  The development of a word-learning strategy , 2003, Cognition.

[9]  Willard Van Orman Quine,et al.  Word and Object , 1960 .

[10]  D. Luce,et al.  Detection and Recognition , 2006 .

[11]  Inge-Marie Eigsti,et al.  Mutual exclusivity in autism spectrum disorders: Testing the pragmatic hypothesis , 2011, Cognition.

[12]  S. Carey,et al.  The role of inferences about referential intent in word learning: Evidence from autism , 2005, Cognition.

[13]  Michael C. Frank,et al.  Using Speakers’ Referential Intentions to Model Early Cross-Situational Word Learning , 2009, Psychological Science.

[14]  E. Markman,et al.  Children's use of mutual exclusivity to constrain the meanings of words , 1988, Cognitive Psychology.

[15]  Charles Kemp,et al.  How to Grow a Mind: Statistics, Structure, and Abstraction , 2011, Science.

[16]  C. Mervis,et al.  Early object labels: the case for a developmental lexical principles framework , 1994, Journal of Child Language.

[17]  L. Markson,et al.  Children's avoidance of lexical overlap: a pragmatic account. , 2001 .

[18]  Michael C. Frank.  Throwing out the Bayesian baby with the optimal bathwater: Response to Endress (2013) , 2013, Cognition.

[19]  W. Geisler.  Ideal Observer Analysis , 2002 .