Induced lexical categories enhance cross-situational learning of word meanings
In this paper we bring together two sources of information that have been proposed as clues used by children acquiring word meanings. One mechanism is cross-situational learning, which exploits co-occurrences between words and their referents in the perceptual context accompanying utterances. The other mechanism is distributional semantics, where meanings are based on word-word co-occurrences. We propose an integrated incremental model which learns lexical categories from linguistic input as well as word meanings from simulated cross-situational data. The co-occurrence statistics between the learned categories and the perceptual context enable the cross-situational word learning mechanism to form generalizations across word forms. Through a number of experiments we show that our automatically and incrementally induced categories significantly improve the performance of the word learning model, and are closely comparable to a set of gold-standard, manually annotated part-of-speech tags. We perform further analyses to examine the impact of various factors, such as word frequency and class granularity, on the performance of the hybrid model of word and category learning. Furthermore, we simulate guessing the most probable semantic features for a novel word from its sentential context in the absence of perceptual cues, an ability which is beyond the reach of a pure cross-situational learner.
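The cross-situational mechanism described above can be illustrated with a minimal sketch (not the authors' model): a learner accumulates word-referent co-occurrence counts across situations, and ambiguity resolves as evidence aggregates. The data structures and example utterances below are hypothetical.

```python
from collections import defaultdict

def cross_situational_learn(situations):
    """Accumulate word-referent co-occurrence counts across situations.

    Each situation pairs an utterance (a list of words) with the set of
    referents visible in the accompanying perceptual context.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for words, referents in situations:
        for w in words:
            for r in referents:
                counts[w][r] += 1
    # For each word, guess the referent it co-occurred with most often.
    return {w: max(rs, key=rs.get) for w, rs in counts.items()}

# Toy input: no single situation disambiguates "dog", but across
# situations DOG is its most frequent co-occurring referent.
situations = [
    (["the", "dog", "barks"], {"DOG"}),
    (["a", "dog", "runs"], {"DOG", "BALL"}),
    (["the", "ball", "rolls"], {"BALL"}),
]
lexicon = cross_situational_learn(situations)
```

A category-aware learner, as the abstract proposes, would additionally pool such statistics over words in the same induced lexical category, allowing generalization to word forms seen in few or no situations.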