Informative communication in word production and word learning

Michael C. Frank, Noah D. Goodman, Peter Lai, and Joshua B. Tenenbaum
{mcfrank, ndg, peterlai, jbt}@mit.edu
Department of Brain and Cognitive Sciences
Massachusetts Institute of Technology

Abstract

Language does not directly code facts about the world. Instead, speakers and listeners rely on shared assumptions to allow them to communicate more efficiently. Writers like Grice and Sperber & Wilson have proposed that communication is assumed to be “informative” or “relevant,” but the predictions of these accounts are often informal or post-hoc. Here we propose a formal analogue to these accounts: that communicators choose what they want to say by how informative it would be about their intended meaning. We derive quantitative predictions about how this assumption would be used in language production and learning and test these predictions via two experiments. This work takes a first step towards formalizing the pragmatic assumptions necessary for effective communication in under-constrained, real-world situations.

Keywords: Language acquisition; Bayesian modeling; Communication

Introduction

How does language work to communicate information from one person to another? Perhaps language is simply a code for facts about the world. On this kind of coding view of communication, all the information necessary to understand an utterance is contained within it. Speakers utter linguistic expressions equivalent to their intended meanings and listeners simply decode these expressions to recover their content. There are a profusion of examples of language use, however, which can be natural and easy to understand but are not easily explained by a naive coding model:

(1) The statement “I ate some of the cookies.” (Intended meaning: I ate some and not all of the cookies).
(2) The declaration “No.” (Intended meaning: I can tell you want to pinch him, but don’t do it).
(3) The contextual introduction of a new word “Can I have the glorzit?” (Intended meaning: pass me that thing, which happens to be called a “glorzit”).

Philosophers and linguists interested in this problem have suggested that language relies on shared assumptions about the nature of the communicative task. Grice (1975) proposed that speakers follow (and are assumed by comprehenders to follow) a set of maxims, such as “be relevant”, or “make your contribution to the conversation as informative as necessary.” Sperber & Wilson (1986) have suggested that there is a shared “Principle of Relevance” which underlies communication. Clark (1996) has argued that communication proceeds by reference to a shared “common ground.”

Though these proposals differ in their details, they share a basic assumption that communicators are not simply coding and decoding meanings. Instead, listeners are making inferences about speakers’ intentions, taking into account the words they utter and the context of their utterances. This kind of intentional inference framework for language seems much more promising for explaining phenomena like (1-3). But although these ideas seem intuitively correct, the difficulty of formalizing notions like “relevance” has largely kept them from making contact with computational theories of language use and acquisition.

The goal of this paper is to begin to address this issue by proposing a computational framework for intentional inference. This framework relies on a shared assumption that communications are informative given the context. Although the basis of our framework is general, making predictions within it requires a model of the space of possible meanings and how they map to natural language expressions. Thus, in order to make a first test of our framework, we study simple games that are similar to the “language games” proposed by Wittgenstein (1953).

In the language games we study, the shared task of communicators is to identify an object from a set using one or a few words. This very restricted task allows us to define the possible meanings that communicators entertain. We then use our framework to make predictions about the meaning and use of single words. This move allows us to define an intuitive mapping between words and meanings: that a word stands for the subset of the context it picks out (its extension). Although these two simplifications do bring our tasks further away from natural language use, they also allow us to derive strong quantitative predictions from our framework.

The outline of the paper is as follows. We first use our framework to derive predictions for speakers and language learners who assume informative communication in an inferential framework. We then test our framework as an account of two different kinds of tasks. Experiment 1 examines, in a simple survey task, whether learners who are inferring the meaning of a novel word assume that speakers are being informative in choosing the word they produce. Experiment 2 tests whether, in a more naturalistic production task, speakers’ word choice is in fact related to the informativeness of the word they pick.

Modeling Informative Communication

Consider Figure 1, which represents the context in a language game. Imagine an English speaker in this game who is told to use a single word to point out the red circle.