Parallel distributed processing models and metaphors for language and development

Parallel distributed processing (PDP) is an approach to modeling cognitive abilities and a hypothesis about the mechanisms mediating learning and representation. This dissertation focuses on one type of PDP model: networks of simple processing units trained by the backpropagation algorithm to learn input-output relationships. Two areas of inquiry are investigated: cognitive development, and the representation of linguistic regularities, such as the associations governing constraints on word combinations.

In Part I, artificial training corpora are constructed to explore how constraints on the order of acquisition of different types of information may be a function of the learning algorithm and the superpositional storage of input-output pairs. The principles that constrain network performance are compared with three complementary approaches to development: maturational, constructivist, and correlational. The exploratory models show that PDP mechanisms are relevant to all three approaches. Networks can be used to illustrate the notions of logical prerequisite and critical period, and they can recover from initial incorrect encodings of a training corpus, such as those that derive from imperfect correlations between aspects of the input and output. These findings are argued to be relevant to the question of what computational mechanisms and representations may facilitate inductive learning.

Part II of the dissertation focuses on the potential of PDP models to capture linguistic regularities. To investigate an analogy between network and human representations of word meaning and word-combination rules, the polysemic structure of six prepositions was analyzed. Two PDP models meeting some of these representational challenges were constructed. The training sets for both models consist of distributions of form-meaning pairs that reflect important aspects of English spatial expressions. Under pressure to encode these distributional regularities with limited resources, the models self-organize to encode the abstract properties of the input vectors and "rules" for their combination. Key to the models' successful integration of words in different contexts is their recoding of input vectors into a format in which co-occurrence relationships among words are explicit. Given that speakers must solve the same problem solved by the networks, human mental representations of word meanings may also conflate attributes of individual words and the contexts in which they occur.
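
For readers unfamiliar with this class of model, the sketch below illustrates the kind of network the dissertation refers to: a small feed-forward network of simple processing units trained by backpropagation to learn a set of input-output pairs, with knowledge of every pair stored superpositionally in the same shared weights. The layer sizes, learning rate, and toy input-output patterns are illustrative assumptions, not materials from the dissertation.

```python
# Minimal sketch of a backpropagation network learning input-output pairs.
# All quantities (layer sizes, learning rate, toy patterns) are assumptions
# chosen for illustration, not the dissertation's actual training corpora.
import numpy as np

rng = np.random.default_rng(0)

# Toy training corpus: four 3-bit input patterns paired with 2-bit outputs.
X = np.array([[0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1]], dtype=float)
Y = np.array([[0, 1], [1, 0], [1, 0], [0, 1]], dtype=float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# One hidden layer of "simple processing units". The same weight matrices
# serve every training pair, so all pairs are stored superpositionally.
W1 = rng.normal(scale=0.5, size=(3, 4))
W2 = rng.normal(scale=0.5, size=(4, 2))
lr = 0.5

for epoch in range(5000):
    # Forward pass: compute hidden and output activations.
    h = sigmoid(X @ W1)
    y_hat = sigmoid(h @ W2)

    # Backward pass: propagate the output error back through the layers.
    err_out = (y_hat - Y) * y_hat * (1 - y_hat)
    err_hid = (err_out @ W2.T) * h * (1 - h)

    # Gradient-descent weight updates.
    W2 -= lr * h.T @ err_out
    W1 -= lr * X.T @ err_hid

# After training, the outputs approximate the target patterns in Y.
print(np.round(sigmoid(sigmoid(X @ W1) @ W2), 2))
```

Because every input-output pair is learned through adjustments to the same shared weights, early-learned and late-learned pairs interact, which is the property the Part I simulations exploit to study order-of-acquisition effects.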