Effect of Source Language on AMR Structure

The Abstract Meaning Representation (AMR) annotation schema was originally designed for English, but the formalism has since been adapted for annotation in a variety of languages. Meanwhile, cross-lingual parsers have been developed to derive English AMR graphs for sentences in other languages, implicitly assuming that English AMR can approximate an interlingua. In this work, we investigate how similar AMR annotations of parallel data are, and we quantify how much the source language of a sentence affects the structure of its AMR graph. As a case study, we take parallel Mandarin Chinese and English AMR annotations and replace all Chinese concepts with equivalent English tokens. We then compare the two graphs using the Smatch metric as a measure of structural similarity. We find that source language has a dramatic impact on AMR structure, with Smatch scores below 50% between English and Chinese graphs in our sample. This figure is an important reference point for interpreting Smatch scores in cross-lingual AMR parsing.
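To make the comparison step concrete, the following is a minimal sketch of a Smatch-style score, not the implementation used in this work. It assumes each graph is given as a list of (relation, source, target) triples, and it replaces the hill-climbing search of the actual Smatch tool with a brute-force search over variable mappings, which is only feasible for very small graphs. The function names and the two toy graphs are made up for illustration.

```python
from itertools import permutations


def variables(triples):
    """Treat the source of every 'instance' triple as a graph variable."""
    return sorted({src for rel, src, _ in triples if rel == "instance"})


def rename(triples, mapping):
    """Rewrite variables according to `mapping`; concept labels are left alone."""
    out = set()
    for rel, src, tgt in triples:
        new_src = mapping.get(src, src)
        new_tgt = tgt if rel == "instance" else mapping.get(tgt, tgt)
        out.add((rel, new_src, new_tgt))
    return out


def smatch_f1(gold, test):
    """Best triple-overlap F1 over all one-to-one variable mappings (brute force)."""
    gold_set, test_set = set(gold), set(test)
    vars_g, vars_t = variables(gold), variables(test)
    # Pad with None so that surplus test variables can stay unmapped.
    slots = vars_g + [None] * max(0, len(vars_t) - len(vars_g))
    best = 0
    for perm in permutations(slots, len(vars_t)):
        # Unmapped variables get a unique placeholder that can never match.
        mapping = {
            vt: (vg if vg is not None else ("unmapped", vt))
            for vt, vg in zip(vars_t, perm)
        }
        best = max(best, len(rename(test_set, mapping) & gold_set))
    if best == 0:
        return 0.0
    precision = best / len(test_set)
    recall = best / len(gold_set)
    return 2 * precision * recall / (precision + recall)


# Toy graphs (illustrative only): "the boy wants to go", with and without
# the reentrant ARG0 edge from go-02 back to boy.
graph_a = [
    ("instance", "w", "want-01"), ("instance", "b", "boy"),
    ("instance", "g", "go-02"),
    ("ARG0", "w", "b"), ("ARG1", "w", "g"), ("ARG0", "g", "b"),
]
graph_b = [
    ("instance", "x", "want-01"), ("instance", "y", "boy"),
    ("instance", "z", "go-02"),
    ("ARG0", "x", "y"), ("ARG1", "x", "z"),
]

score = smatch_f1(graph_a, graph_b)  # 10/11, about 0.909
```

In this toy case the two graphs share identical concept labels, so the only loss comes from one missing edge. Relabeling concepts into a common vocabulary before scoring, as described above, is what lets the remaining score differences be attributed to graph structure rather than to vocabulary.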
