Discovering the Lexical Features of a Language

This paper examines the possibility of automatically discovering the lexieal features of a language. There is strong evidence that the set of possible lexical features which can be used in a language is unbounded, and thus not innate. Lakoff [Lakoff 87] describes a language in which the feature -I-woman-or-fire-ordangerons-thing exists. This feature is based upon ancient folklore of the society in which it is used. If the set of possible lexieal features is indeed unbounded, then it cannot be par t of the innate Universal Grammar and must be learned. Even if the set is not unbounded, the child is still left with the challenging task of determining which features are used in her language. If a child does not know a priori what lexical features are used in her language, there are two sources for acquiring this information: semantic and syntactic cues. A learner using semantic cues could recognize that words often refer to objects, actions, and properties, and from this deduce the lexical features: noun, verb and adjective. Pinker [Pinker 89] proposes that a combination of semantic cues and innate semantic primitives could account for the acquisition of verb features. He believes that the child can discover semantic properties of a verb by noticing the types of actions typically taking place when the verb is uttered. Once these properties are known, says Pinker, they can be used to reliably predict the distributional behavior of the verb. However, Gleitman [Gleitman 90] presents evidence that semantic cues axe not sufficient for a child to acquire verb features and believes that the use of this semantic information in conjunction with information about the subcategorization properties of the verb may be sufficient for learning verb features. This paper takes Glei tman's suggestion to the extreme, in hope of determining whether syntactic cues may not just aid in feature discovery, but may be all tha t is necessary. We present evidence for the sufficiency of a strictly syntax-based model for discovering