Interpretability for Morphological Inflection: from Character-level Predictions to Subword-level Rules

Neural models for morphological inflection have recently achieved very high accuracy; however, interpreting their predictions remains challenging. Towards this goal, we propose a simple linguistically-motivated variant of the encoder-decoder model with attention. In our model, the character-level cross-attention mechanism is complemented with a self-attention module over substrings of the input. We design a novel approach for extracting patterns from attention weights in order to interpret what the model learns. We apply this methodology to analyze the model's decisions on three typologically different languages and find that a) our pattern extraction method applied to cross-attention weights uncovers variation in the form of inflectional morphemes, b) pattern extraction from self-attention reveals the triggers of such variation, and c) both types of patterns align closely with grammatical inflection classes and class assignment criteria for all three languages. Additionally, the proposed encoder attention component leads to consistent performance improvements over a strong baseline.
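
The abstract does not include the authors' implementation. The following is a minimal, hypothetical PyTorch sketch of how a character-level cross-attention step could be complemented with self-attention over substrings of the input, as described above; the module names, dimensions, and fusion step are illustrative assumptions, not the paper's actual architecture. The two returned attention-weight tensors are the quantities a pattern-extraction procedure would inspect.

    # Hypothetical sketch (not the authors' released code): one decoder step that
    # combines character-level cross-attention with self-attention over substrings
    # of the input form. All names and dimensions are illustrative assumptions.
    import torch
    import torch.nn as nn

    class CharCrossAttnWithSubstringSelfAttn(nn.Module):
        def __init__(self, hidden_dim: int = 128, num_heads: int = 4):
            super().__init__()
            # Cross-attention: the decoder state attends to character-level encoder states.
            self.cross_attn = nn.MultiheadAttention(hidden_dim, num_heads, batch_first=True)
            # Self-attention over substring (subword) representations of the input.
            self.substring_self_attn = nn.MultiheadAttention(hidden_dim, num_heads, batch_first=True)
            self.out = nn.Linear(2 * hidden_dim, hidden_dim)

        def forward(self, dec_state, char_enc, substr_enc):
            # dec_state:  (batch, 1, hidden)        -- current decoder hidden state
            # char_enc:   (batch, n_chars, hidden)  -- character-level encoder outputs
            # substr_enc: (batch, n_substr, hidden) -- embeddings of input substrings
            char_ctx, char_weights = self.cross_attn(dec_state, char_enc, char_enc)
            substr_ctx, substr_weights = self.substring_self_attn(substr_enc, substr_enc, substr_enc)
            # Pool the substring representation and fuse it with the character context.
            pooled = substr_ctx.mean(dim=1, keepdim=True)
            fused = torch.tanh(self.out(torch.cat([char_ctx, pooled], dim=-1)))
            # char_weights and substr_weights are the attention distributions that
            # pattern extraction would analyze.
            return fused, char_weights, substr_weights

    # Toy usage with random tensors.
    module = CharCrossAttnWithSubstringSelfAttn()
    fused, cw, sw = module(torch.randn(2, 1, 128), torch.randn(2, 7, 128), torch.randn(2, 3, 128))
    print(fused.shape, cw.shape, sw.shape)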
