Incorporating Connotation of Meaning into Models of Semantic Representation: An Application in Text Corpus Analysis

Shane T. Mueller (smueller@ara.com)
Klein Associates Division, A. R. A. Inc., 1750 Commerce Center Boulevard North, Fairborn, OH 45434 USA

Richard M. Shiffrin (shiffrin@indiana.edu)
Department of Psychological and Brain Sciences, 1101 E. 10th Street, Bloomington, IN 47404 USA

Abstract

Connotation of meaning is an important aspect of human semantic knowledge, and it cannot be captured in simple prototype representations of concepts. Yet models of human episodic memory typically rely on prototype representations, as do statistical techniques for extracting meaningful representations from text corpora (such as LSA). We demonstrate how REM-II (a model of human episodic and semantic memory) allows connotation of meaning to be represented, and show that the model can develop and learn reasonable semantic representations by processing the Mindpixel project's 80,000-statement GAC corpus. The success of the model at developing meaningful and contextual representations from a text corpus demonstrates the importance and utility of our assumptions.

Keywords: episodic memory; semantic memory; text corpus analysis

Connotation of meaning describes the fact that the concepts we understand have multiple context-specific forms. If we consider linguistic concepts, extreme versions of connotation encompass homophony, homonymy, and polysemy: single word forms sharing multiple distinct meanings. Words exhibiting these properties make connotation a challenge for automated systems attempting to understand language, because the context of the word must be considered in order to understand its proper meaning. But even subtler forms of connotation can be important, and this importance can transcend purely linguistic contexts. For example, consider how taxi cabs in different cities and countries differ substantially from one another. In Manhattan, a typical taxi is a yellow four-door sedan built by an American car company; in Mexico City, a typical taxi may be a small green compact vehicle. Thus, what we are calling connotation of meaning is an important aspect of our knowledge, for linguistic and non-linguistic stimuli and for extreme and subtle cases.

Connotation of meaning has been shown to be important in language learning (Corrigan, 2002), meaning disambiguation (e.g., Swinney, 1979), and even latent emotional content (e.g., Cato et al., 2004). As a rough guide to its prevalence in English, Merriam-Webster's Collegiate Dictionary, 11th edition (2003) contains 165,000 entries with 225,000 definitions. Thus, there are approximately 1.36 meanings for each word, even though homonyms are given distinct entries and the dictionary is likely to contain large numbers of infrequent and specialized terms with only one definition.

Yet many psychological models of knowledge and concept representation fail to capture connotation. For example, prototype approaches typically consider information to be encoded as a set of features, and accumulate average or typical feature values across many individual events to form a composite, ignoring systematic variation and correlation among features. Such an approach is not unreasonable, because it allows a rich composite of central tendency to be formed from a set of noisy individuals. But if there are consistent patterns in the co-occurrence of features, a prototype will not be sensitive to them and will not be able to regenerate these distinct contextual representations. A prototype for the concept taxi would be a concept that never occurs in the world: a vehicle that is a mixture between a sedan and a compact car, in a color somewhere between yellow and green. And consider adding rickshaws, airport shuttles, limousine services, and horse-drawn carriages to the prototype: the result is nearly impossible to imagine.

Despite the inadequacy of prototype techniques for representing knowledge, techniques for extracting meaningful representations from text corpora typically use prototypes. For example, HAL (Burgess & Lund, 1997) uses a graded word co-occurrence vector to represent semantic space; LSA (Landauer & Dumais, 1997) uses co-occurrences as input and projects this information onto a lower-dimensional space using statistical optimization procedures similar to factor analysis. Likewise, the Topics model (Griffiths & Steyvers, 2004) uses a Bayesian approach to place constraints on the statistical distribution taken by features, and as a byproduct generates features that are often interpretable. And recently, Jones and Mewhort (2007) demonstrated that order and meaning can be incorporated into a composite holographic trace using a convolution/correlation process. Of these, only Jones and Mewhort (2007) use a representation of knowledge that is not a simple prototype; instead, they use a complex holographic representation in which information is distributed.

In order to move beyond a simple prototype knowledge representation, we propose that knowledge accumulates in the form of feature co-occurrences. Thus, if one considers all
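The prototype-averaging problem described above can be made concrete in a short sketch. The feature coding here is hypothetical and chosen only for illustration (it is not from the model itself): each taxi is a pair (color hue, vehicle size), with hue 0.0 = green, 1.0 = yellow and size 0.0 = compact, 1.0 = sedan.

```python
from collections import Counter

# Hypothetical feature coding for the taxi example in the text.
manhattan_taxis = [(1.0, 1.0), (1.0, 1.0), (0.9, 1.0)]    # yellow sedans
mexico_city_taxis = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1)]  # green compacts
taxis = manhattan_taxis + mexico_city_taxis

# A prototype averages each feature independently ...
n = len(taxis)
prototype = tuple(sum(t[i] for t in taxis) / n for i in range(2))
print(prototype)  # roughly (0.5, 0.52): a yellowish-green mid-size car
                  # that never occurs in the world

# ... whereas accumulating feature co-occurrences keeps the pairing:
# count how often "yellow" (hue > 0.5) occurs with "sedan" (size > 0.5).
pairs = Counter(("yellow" if h > 0.5 else "green",
                 "sedan" if s > 0.5 else "compact") for h, s in taxis)
print(pairs)  # yellow only ever pairs with sedan, green with compact;
              # the mixed combinations never occur
```

The averaged prototype discards exactly the correlational structure (yellow goes with sedan, green with compact) that the co-occurrence counts preserve.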
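The LSA-style pipeline described above can be sketched in a few lines: a word-by-context co-occurrence count matrix is projected onto a lower-dimensional space via singular value decomposition (the statistical optimization the text compares to factor analysis). The corpus and counts here are toy values invented for illustration, not drawn from the GAC corpus.

```python
import numpy as np

words = ["taxi", "cab", "sedan", "horse", "carriage"]
# Rows: words; columns: four toy contexts (documents).
counts = np.array([
    [2, 3, 0, 0],   # taxi
    [3, 2, 0, 0],   # cab
    [1, 2, 0, 1],   # sedan
    [0, 0, 3, 2],   # horse
    [0, 0, 2, 3],   # carriage
], dtype=float)

# SVD, keeping only the k largest singular values/vectors.
U, s, Vt = np.linalg.svd(counts, full_matrices=False)
k = 2
word_vectors = U[:, :k] * s[:k]   # each word as a point in the reduced space

def cosine(a, b):
    """Cosine similarity; near 1 for words sharing contexts."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine(word_vectors[0], word_vectors[1]))  # taxi vs cab: high
print(cosine(word_vectors[0], word_vectors[3]))  # taxi vs horse: low
```

Note that the reduced vector for each word is still a single prototype point: two distinct senses of a word would be averaged into one location, which is the limitation at issue.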
References

[1] Klein, G., et al. (2006). Making Sense of Sensemaking 2: A Macrocognitive Model. IEEE Intelligent Systems.
[2] Landauer, T., et al. (1997). A Solution to Plato's Problem: The Latent Semantic Analysis Theory of Acquisition, Induction, and Representation of Knowledge.
[3] Shiffrin, R., et al. (1997). A model for recognition memory: REM—retrieving effectively from memory. Psychonomic Bulletin & Review.
[4] Corrigan, R. (2004). The acquisition of word connotations: asking 'what happened?'. Journal of Child Language.
[5] Merriam-Webster. (2016). Merriam-Webster's Collegiate Dictionary.
[6] Swinney, D. (1979). Lexical access during sentence comprehension: (Re)consideration of context effects.
[7] Fischler, I., et al. (2004). Processing Words with Emotional Connotation: An fMRI Study of Time Course and Laterality in Rostral Frontal and Retrosplenial Cortices. Journal of Cognitive Neuroscience.
[8] Burgess, C., et al. (1997). Modelling Parsing Constraints with High-dimensional Context Space.
[9] Steyvers, M., et al. (2004). Finding scientific topics. Proceedings of the National Academy of Sciences of the United States of America.
[10] Smetacek, V., et al. (2004). Making sense. Nature.
[11] Jones, M. N., et al. (2007). Representing word meaning and order information in a composite holographic lexicon. Psychological Review.
[12] Mueller, S. T., et al. (2006). REM-II: A Model of the Developmental Co-Evolution of Episodic Memory and Semantic Knowledge.