Neural Graphical Models over Strings for Principal Parts Morphological Paradigm Completion

Many of the world’s languages contain an abundance of inflected forms for each lexeme. A critical task in processing such languages is predicting these inflected forms. We develop a novel statistical model for the problem, drawing on graphical modeling techniques and recent advances in deep learning. Our model is a Bayesian network that draws inspiration from principal parts morphological analysis, and we derive a Metropolis-Hastings algorithm to decode it jointly. We demonstrate improvements on five languages.
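To make the joint decoding idea concrete, here is a minimal Python sketch of Metropolis-Hastings search over a toy two-slot paradigm. It is an illustration under stated assumptions, not the paper's model: the candidate forms, the hand-set scores in `log_score`, and the function `metropolis_hastings` are all hypothetical, and the neural factors of the actual model are replaced by a lookup table. Because the proposal is symmetric, the Hastings correction cancels and the acceptance test reduces to the plain Metropolis ratio.

```python
import math
import random

# Hypothetical candidate forms for two paradigm slots (toy data, not from the paper).
CANDIDATES = {
    "past": ["walked", "walkt", "walken"],
    "participle": ["walked", "walken", "walking"],
}

def log_score(state):
    """Toy unnormalized log-probability of a joint assignment of forms to slots."""
    # Hand-set unary scores stand in for a learned string model (assumption).
    unary = {"walked": 0.0, "walkt": -2.0, "walken": -1.5, "walking": -3.0}
    total = sum(unary[form] for form in state.values())
    # Pairwise factor: reward the two slots for agreeing (assumption).
    if state["past"] == state["participle"]:
        total += 1.0
    return total

def metropolis_hastings(n_steps=2000, seed=0):
    """Run the chain and return the highest-scoring state visited."""
    rng = random.Random(seed)
    state = {slot: rng.choice(forms) for slot, forms in CANDIDATES.items()}
    best, best_score = dict(state), log_score(state)
    for _ in range(n_steps):
        # Symmetric proposal: pick one slot and resample it uniformly.
        slot = rng.choice(sorted(CANDIDATES))
        proposal = dict(state)
        proposal[slot] = rng.choice(CANDIDATES[slot])
        # Accept with probability min(1, p(proposal) / p(state)).
        log_ratio = log_score(proposal) - log_score(state)
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            state = proposal
        # Decoding keeps the best state seen, not the final sample.
        current = log_score(state)
        if current > best_score:
            best, best_score = dict(state), current
    return best, best_score

best, best_score = metropolis_hastings()
print(best, best_score)
```

Tracking the best state visited turns the sampler into a search procedure. In a model where each slot's form is scored conditionally on other slots, the same loop applies with `log_score` replaced by the sum of those conditional log-probabilities.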
