Probabilistic Learning Algorithms and Optimality Theory

This article provides a critical assessment of the Gradual Learning Algorithm (GLA) for probabilistic optimality-theoretic (OT) grammars proposed by Boersma and Hayes (2001). We discuss the limitations of a standard algorithm for OT learning and outline how the GLA attempts to overcome them. We point out a number of serious shortcomings of the GLA: (a) Methodologically, the GLA has not been tested on unseen data, although such testing is standard practice in computational language learning. (b) We provide counterexamples, that is, attested data sets that the GLA is not able to learn. (c) Essential algorithmic properties of the GLA (correctness and convergence) have not been proven formally. (d) By modeling frequency distributions in the grammar, the GLA conflates the notions of competence and performance. This conflation leads to serious conceptual problems, as OT crucially relies on the competence/performance distinction.

[1]  Paola Merlo,et al.  A corpus-based analysis of verb continuation frequencies for syntactic processing , 1994 .

[2]  A. Sorace Gradients in Auxiliary Selection with Intransitive Verbs. , 2000 .

[3]  Gregory R. Guy Violable is variable: Optimality theory and linguistic variation , 1997, Language Variation and Change.

[4]  P. Boersma How we learn variation, optionality and probability , 1997 .

[5]  Peter Sells,et al.  Formal and empirical issues in optimality theoretic syntax , 2001 .

[6]  Matthew W. Crocker,et al.  Ambiguity Resolution in Sentence Processing: Evidence against Frequency-Based Accounts , 2000 .

[7]  Frank Keller,et al.  Gradience in Grammar: Experimental and Computational Aspects of Degrees of Grammaticality , 2001 .

[8]  A. Sorace,et al.  MAGNITUDE ESTIMATION OF LINGUISTIC ACCEPTABILITY , 1996 .

[9]  Daniel Jurafsky,et al.  How Verb Subcategorization Frequencies Are Affected By Corpus Choice , 1998, COLING.

[10]  Hans Uszkoreit,et al.  Word Order and Constituent Structure in German , 1987, CSLI Lecture Notes.

[11]  Charles Reiss,et al.  Formal and Empirical Arguments concerning Phonological Acquisition , 1998, Linguistic Inquiry.

[12]  P. Boersma,et al.  Empirical Tests of the Gradual Learning Algorithm , 2001, Linguistic Inquiry.

[13]  Bruce Hayes,et al.  Gradient Well-Formedness in Optimality Theory , 2000 .

[14]  Sten Vikner,et al.  Optimality-theoretic syntax , 2001 .

[15]  A. Sorace Unaccusativity and auxiliary choice in non-native grammars of Italian and French: asymmetries and predictable indeterminacy , 1993, Journal of French Language Studies.

[16]  Christopher D. Manning,et al.  Soft Constraints Mirror Hard Constraints: Voice and Person in English and Lummi , 2002 .

[17]  Paul Boersma,et al.  Learning a grammar in Functional Phonology , 2000 .

[18]  Judith Aissen,et al.  Markedness and Subject Choice in Optimality Theory , 1999 .

[19]  Shipra Dingare,et al.  The Effect of Feature Hierarchies on Frequencies of Passivization in English , 2001 .

[20]  Paul Boersma,et al.  Gradual constraint-ranking learning algorithm predicts acquisition order , 1999 .

[21]  Noam Chomsky,et al.  Aspects of the theory of syntax , 1965 .

[22]  Hye-Won Choi,et al.  Optimizing structure in context: scrambling and information structure , 1996 .

[23]  Ash Asudeh,et al.  Linking, Optionality, and Ambiguity in Marathi , 2000 .

[24]  Ash Asudeh,et al.  Constraints on Linguistic Coreference: Structural vs. Pragmatic Factors , 2001 .

[25]  Bruce Tesar,et al.  Learnability in Optimality Theory (long version) , 1996 .

[26]  Gregory R. Guy,et al.  Inherent variability and the obligatory contour principle , 1997, Language Variation and Change.

[27]  Bruce Hayes,et al.  Reduplication and syllabification in Ilokano , 1989 .

[28]  Mirella Lapata,et al.  A Probabilistic Account of Logical Metonymy , 2003, Computational Linguistics.

[29]  A. Sorace Incomplete vs. divergent representations of unaccusativity in non native grammars of Italian , 1993 .

[30]  Noam Chomsky,et al.  Lectures on Government and Binding , 1981 .

[31]  P. Smolensky,et al.  Optimality Theory: Constraint Interaction in Generative Grammar , 2004 .

[32]  Carson T. Schütze The empirical base of linguistics: Grammaticality judgments and linguistic methodology , 1998 .

[33]  Christopher Culy,et al.  Statistical Distribution and the Grammatical/Ungrammatical Distinction , 1998, Grammars.

[34]  Wayne Cowart,et al.  Experimental Syntax: Applying Objective Methods to Sentence Judgments , 1997 .

[35]  Paul Boersma,et al.  Optimality-Theoretic learning in the Praat program , 1999 .

[36]  Noam Chomsky,et al.  The Minimalist Program , 1992 .

[37]  Steven Abney,et al.  Statistical Methods and Linguistics , 2002 .

[38]  William J. Turkel Learning Phonology: Genetic Algorithms and Yoruba Tongue Root Harmony , 2000 .

[39]  Jeroen van de Weijer,et al.  Optimality theory : phonology, syntax, and acquisition , 2000 .

[40]  Gert Westermann,et al.  Emergent modularity and U-shaped learning in a constructivist neural network learning the English past tense , 1998 .

[41]  F. Hinskens,et al.  Variation, change and phonological theory , 1997 .

[42]  G. Müller Optimality, markedness, and word order in German , 1999 .

[43]  M. Tanenhaus,et al.  Verb-specific constraints in sentence processing: Separating effects of lexical preference from garden-paths. , 1993 .

[44]  Thomas G. Dietterich What is machine learning? , 2020, Archives of Disease in Childhood.

[45]  Wayne Cowart,et al.  Experimental evidence for a minimalist account of English resumptive pronouns , 1999, Cognition.

[46]  Andrew Koontz-Garboden,et al.  A stochastic OT approach to word order variation in Korlai Portuguese , 2001 .

[47]  J. Bresnan Lexical-Functional Syntax , 2000 .

[48]  Frank Keller,et al.  Verb Frame Frequency as a Predictor of Verb Bias , 2001, Journal of psycholinguistic research.

[49]  Arto Anttila,et al.  Variation in Finnish phonology and morphology , 1997 .

[50]  Peter Broeder,et al.  Models of Language Acquisition: Inductive and Deductive Approaches , 2001 .

[51]  Frank Keller Evaluating Competition-based Models of Word Order , 2000 .

[52]  A. Anttila Deriving variation from grammar: A study of Finnish genitives , 2000 .

[53]  Bruce P. Hayes,et al.  Quatrain form in English folk verse , 1998 .

[54]  Frank Keller,et al.  Phonology competes with syntax: experimental evidence for the interaction of word order and accent placement in the realization of Information Structure , 2001, Cognition.

[55]  Judith L. Klavans,et al.  Book Reviews: The Balancing Act: Combining Symbolic and Statistical Approaches to Language , 1997, CL.

[56]  Christopher T. Kello,et al.  Verb-specific constraints in sentence processing: separating effects of lexical preference from garden-paths. , 1993, Journal of experimental psychology. Learning, memory, and cognition.

[57]  Ivan A. Sag,et al.  Book Reviews: Head-driven Phrase Structure Grammar and German in Head-driven Phrase-structure Grammar , 1996, CL.