A Morpho-Phonological Learner

There are two main sources from which a child could learn language-particular phonological rules or constraints: segment distribution data and allomorphy. That is, seeing which phones appear in which phonetic contexts can give some initial clues about the phonotactics of a language, and seeing how morphemes alternate in phonetic form depending on morpho-phonological context can then fill in the rest of the picture, with some help from innate biases. This project leaves aside the question of how segment distribution data might be used. It also does not attempt to explain how the actual phonological learning takes place. Instead, we concentrate here on how to determine what morphemes a language has and what allomorphs each morpheme has. Note that at no point do we hypothesize or attempt to derive an abstract underlying form for words or morphemes; we simply collect the related surface forms.

The learner takes as its input the same phonological data that a child would get: sounds and their associated meanings. For simplicity’s sake we assume that the learner has already figured out how to break the sound into segments and that the learner can be sure of the associated meanings. Thus, the sound data is provided in the form of a phonetic transcription, and the meanings are provided in the form of a set of semantic and/or grammatical features. For example, a Russian child hearing the nominative singular of “bread” would receive the following information: