Paper information - Stochastic Approaches to Morphology Acquisition

Stochastic Approaches to Morphology Acquisition

One of the first steps in acquiring a morphology system is discovering which phonetic strings correspond to morphemes. These phonetic strings can then be further analyzed to determine their grammatical privileges and contribution to meaning, and thus to bootstrap into a functional morphology system. Discovering the relevant phonetic strings is harder than it first appears. Morpheme discovery presents a number of difficulties beyond those that arise in the similar task of word discovery and segmentation. Although both require segmenting a continuous speech stream, word segmentation can take advantage of the fact that some words are spoken in isolation, and those words can be used to bootstrap into the segmentation of other words. Although this will work for some morphemes (many words are monomorphemic), in many languages, such as English and Spanish, grammatical morphemes are often bound and thus never heard in isolation. Additionally, there is no simple strategy that will universally work for breaking a word into its component morphemes. Although in many languages grammatical morphemes occur at either the beginning or the end of a word, an approach whereby the child simply assumes that the first or last syllable is a morpheme will only work if that assumption aligns with the language environment to which the child is exposed. Because the affixing languages of the world can have (multiple) prefixes, suffixes, and infixes, such an approach is likely to fail. Additionally, acquiring grammatical morphemes is much like acquiring function words; unlike nouns, function words have little concrete semantic meaning, which likely contributes to the difficulty of learning these types of words (Bird et al. 2001, Caselli et al. 1995, Gentner 1982, Morrison et al. 1997).
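The limitation of a fixed-position strategy described above can be illustrated with a minimal sketch. It is not a model from the paper: the word list is a hypothetical toy sample, orthographic characters stand in for phonetic strings and syllables, and the names `TOY_WORDS`, `last_chunk_guess`, and `score_by_position` are invented for illustration. The heuristic is even given the true affix length, which is more help than a child would have.

```python
# Toy sample (an assumption for illustration): (word, affix, affix position).
# Spelling stands in for the phonetic strings discussed in the text.
TOY_WORDS = [
    ("walking", "ing", "suffix"),  # English: walk + -ing
    ("jumping", "ing", "suffix"),  # English: jump + -ing
    ("unhappy", "un", "prefix"),   # English: un- + happy
    ("sumulat", "um", "infix"),    # Tagalog: s<um>ulat
]


def last_chunk_guess(word, affix_len):
    """Fixed strategy: treat the final affix_len characters as the affix."""
    return word[-affix_len:]


def score_by_position(items):
    """Return {position: (correct guesses, total words)} for the fixed strategy."""
    hits = {}
    for word, affix, position in items:
        guess = last_chunk_guess(word, len(affix))
        correct, total = hits.get(position, (0, 0))
        hits[position] = (correct + int(guess == affix), total + 1)
    return hits


results = score_by_position(TOY_WORDS)
for position, (correct, total) in results.items():
    print(f"{position}: {correct}/{total} affixes recovered")
```

On this sample the word-final assumption recovers both suffixes but neither the prefix nor the infix, which is the sense in which a strategy tied to one word position depends on the language environment.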
The search for morpheme forms does have the advantage that a given morpheme generally occurs within certain syntactic environments (e.g., the morpheme -ing in English generally occurs with verbs). Although it has been noted that morphology can help a child acquire syntax (Morgan et al. 1987), the reverse may also be true. The relationship between morphology and syntax could be beneficial both for discovering bound morphemes and for knowing which words a given bound morpheme can attach to. For instance, -ing might be more readily detected as a suffix when only verbs are examined than when all words are examined. Additionally, once a child has discovered that -ing can be applied to a particular verb, extending that ending only to other verbs will greatly reduce overgeneralization errors.
There is a long history of research on morphology discovery models (e.g., Brent & Cartwright 1996, Goldsmith 2001, Harris 1955). Many of these systems, such as that by Erjavec and Džeroski (2004), are not designed to model child language acquisition, but rather are designed for computational tasks such as parsing a database. Because we are interested in how children acquire morphological forms, only models of language learning will be discussed here. In order to model the acquisition of morphological forms by children, an automatic morphology discovery system must have the following characteristics. First, since morphemes must be acquired by the child (i.e., they are highly language-specific and thus cannot be innate), any morphology discovery system must use a plausible learning mechanism. This entails not only using information available to the language learner, but also using mechanisms that children possess. Second, because morphemes can appear as (multiple) prefixes, suffixes, and infixes in affixing languages, any morpheme discovery system must be flexible about the position in the word where the morpheme occurs. Third, it must generate a robust list of morphemes that is minimally sufficient to allow the child to bootstrap into the rest of the morphological system. Finally, given that grammatical morphemes generally occur

Toben H. Mintz | Justin M. Aronoff | Nuria Giralt