Not quite there yet: Combining analogical patterns and encoder-decoder networks for cognitively plausible inflection

This paper presents four models submitted to Part 2 of the SIGMORPHON 2021 Shared Task 0, which aims at replicating human judgements on the inflection of nonce lexemes. Our goal is to explore the usefulness of combining pre-compiled analogical patterns with an encoder-decoder architecture. Two models are designed to use such patterns either in the input or in the output of the network. Two additional models control for the role of raw similarity between nonce inflected forms and existing inflected forms in the same paradigm cell, and for the role of the type frequency of analogical patterns. Our strategy is entirely endogenous in the sense that the models appeal solely to the data provided by the SIGMORPHON organisers, without using external resources. Our Model 2 ranks second among all submitted systems, suggesting that the inclusion of analogical patterns in the network architecture is useful in mimicking speakers' predictions.
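
As a purely illustrative sketch (not the authors' implementation), the snippet below shows one way a pre-compiled analogical pattern could be injected into the input side of a character-level encoder-decoder for inflection: the pattern is serialised as extra tokens and prepended to the nonce lemma's characters before encoding. The delimiter tokens, the cell tag, and the pattern format are assumptions made for illustration only.

```python
# Hypothetical sketch: prepending an analogical pattern to the encoder input
# of a character-level inflection model. Token names and the pattern format
# are illustrative assumptions, not the submitted models' actual encoding.

def encode_with_pattern(lemma, cell, pattern):
    """Build an encoder input sequence from a nonce lemma, a paradigm cell
    tag, and a pre-compiled analogical pattern such as ('X', 'Xed')."""
    src, tgt = pattern
    tokens = ["<cell>", cell, "</cell>"]                               # morphosyntactic features
    tokens += ["<pat>"] + list(src) + ["->"] + list(tgt) + ["</pat>"]  # analogical pattern
    tokens += ["<lem>"] + list(lemma) + ["</lem>"]                     # nonce lemma characters
    return tokens

# Example: an English-style nonce verb paired with a regular past-tense pattern.
print(encode_with_pattern("wug", "V;PST", ("X", "Xed")))
# ['<cell>', 'V;PST', '</cell>', '<pat>', 'X', '->', 'X', 'e', 'd', '</pat>',
#  '<lem>', 'w', 'u', 'g', '</lem>']
```

The resulting token sequence would then be embedded and fed to the encoder, so the network can condition its prediction on both the lemma and the candidate analogical pattern; placing the pattern on the output side instead would amount to having the decoder generate pattern tokens rather than consume them.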
