Exact Hard Monotonic Attention for Character-Level Transduction

Many common character-level, string-to-string transduction tasks, e.g., grapheme-to-phoneme conversion and morphological inflection, consist almost exclusively of monotonic transduction. However, neural sequence-to-sequence models with soft attention, which are inherently non-monotonic, often outperform popular monotonic models. In this work, we ask the following question: Is monotonicity really a helpful inductive bias in these tasks? We develop a hard attention sequence-to-sequence model that enforces strict monotonicity and learns a latent alignment jointly while learning to transduce. With the help of dynamic programming, we are able to compute the exact marginalization over all monotonic alignments. Our models achieve state-of-the-art performance on morphological inflection. Furthermore, we find strong performance on two other character-level transduction tasks.
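
The exact marginalization the abstract refers to can be computed with an HMM-style forward pass over source positions. Below is a minimal NumPy sketch, not the paper's exact parameterization: `log_emit`, `log_trans`, and `log_init` are hypothetical stand-ins for a network's emission, transition, and initial-alignment scores, and strict monotonicity is imposed by masking leftward transitions to -inf.

```python
import numpy as np
from scipy.special import logsumexp

def monotonic_marginal_logprob(log_emit, log_init, log_trans):
    """Exact log p(y | x), summed over all monotonic alignments.

    log_emit:  (T, J)      log-score of emitting y_t when aligned to x_j
    log_init:  (J,)        log-score of the first alignment position
    log_trans: (T-1, J, J) log-score of moving alignment i -> j between
                           steps; entries with j < i must be -inf so the
                           alignment never moves left (strict monotonicity)
    """
    T, J = log_emit.shape
    # alpha[j] = log-score of generating y_1..t with the t-th alignment at j
    alpha = log_init + log_emit[0]
    for t in range(1, T):
        # Sum (in log space) over all previous positions i, then emit y_t.
        alpha = logsumexp(alpha[:, None] + log_trans[t - 1], axis=0) + log_emit[t]
    return logsumexp(alpha)

# Toy usage with random scores: target length 4, source length 5.
rng = np.random.default_rng(0)
T, J = 4, 5
mono_mask = np.where(np.triu(np.ones((J, J), dtype=bool)), 0.0, -np.inf)
log_emit = rng.normal(size=(T, J))
log_init = rng.normal(size=J)
log_trans = rng.normal(size=(T - 1, J, J)) + mono_mask
print(monotonic_marginal_logprob(log_emit, log_init, log_trans))
```

This pass costs O(|y| · |x|^2) time rather than enumerating the exponentially many alignments; the monotonicity constraint is what makes the dynamic program exact.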
