Non-convergence in the GLA and variation in the CDA

Given a linguistic theory with constraints/parameters/rules of any generality, learners are faced with what Dresher (1999) calls the credit problem. In terms of Optimality Theory (OT: Prince and Smolensky 1993/2004), there is usually more than one constraint that favors an optimal form over any one of its competitors and that could be ranked by the learner above the constraints preferring that competitor. One of the most attractive properties of the constraint demotion algorithm (CDA: Tesar and Smolensky 1998) is that it finds a ranking that correctly deals with the learning data (if one exists), without directly stipulating a solution to the credit problem. In the first part of the paper, I show that the Gradual Learning Algorithm (GLA: Boersma 1997, Boersma and Hayes 2001) does not share this property: it fails to converge when presented with a set of data that contains multiple interacting instantiations of the credit problem. The advantage of the GLA over the CDA is that it handles variation; in the second part of the paper, I sketch an approach to the learning of variation that makes use of the inconsistency detection properties of the CDA (Tesar 1998).

A piece of learning data in Optimality Theory consists of an input and the optimal output, or Winner, paired with another output candidate, or Loser (a mark-data pair, or M-D pair). Constraints are annotated for whether they prefer the Winner or the Loser (this is an alternative to showing the violation marks incurred by each candidate; see esp. Prince 2003). Using this notation, a simple instance of the credit problem would be as in (1), where two constraints (Con1 and Con3) favor the Winner, and Con2 favors the Loser.
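To make the Winner/Loser notation and the behavior of constraint demotion on cases like (1) concrete, the following is a minimal Python sketch, not taken from the paper: the constraint names Con1-Con3 come from (1), but the violation counts are invented for illustration, and the demotion procedure is a sketch in the recursive (batch) style of Tesar and Smolensky (1998), not their formal statement.

    def annotate(winner_viols, loser_viols):
        """Map each constraint to 'W' (prefers the Winner), 'L' (prefers the Loser),
        or 'e' (no preference), by comparing the candidates' violation counts."""
        return {con: 'W' if winner_viols[con] < loser_viols[con]
                     else 'L' if winner_viols[con] > loser_viols[con]
                     else 'e'
                for con in winner_viols}

    def rcd(pairs, constraints):
        """Recursive constraint demotion, sketched: build a stratified hierarchy
        consistent with the W/L annotations, or return None if the data are
        inconsistent (no ranking accounts for every mark-data pair)."""
        hierarchy, remaining, cons = [], list(pairs), set(constraints)
        while cons:
            # Constraints that prefer no Loser in any unexplained pair can be ranked now.
            stratum = {c for c in cons if all(p.get(c) != 'L' for p in remaining)}
            if not stratum:
                return None
            hierarchy.append(sorted(stratum))
            cons -= stratum
            # A pair is accounted for once some newly ranked constraint prefers its Winner.
            remaining = [p for p in remaining
                         if not any(p.get(c) == 'W' for c in stratum)]
        return hierarchy if not remaining else None

    # The instance in (1): Con1 and Con3 favor the Winner, Con2 favors the Loser.
    # (The violation counts below are hypothetical.)
    pair = annotate({'Con1': 0, 'Con2': 1, 'Con3': 0},
                    {'Con1': 1, 'Con2': 0, 'Con3': 1})
    print(pair)                                   # {'Con1': 'W', 'Con2': 'L', 'Con3': 'W'}
    print(rcd([pair], ['Con1', 'Con2', 'Con3']))  # [['Con1', 'Con3'], ['Con2']]

Note that the resulting hierarchy ranks both Con1 and Con3 above Con2 without choosing between the two Winner-preferring constraints; this is the sense in which the CDA finds a ranking consistent with the data without directly stipulating a solution to the credit problem.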