Multilingual Coreference Resolution with Harmonized Annotations

In this paper, we present coreference resolution experiments with the newly created multilingual corpus CorefUD (Nedoluzhko et al., 2021). We focus on the following languages: Czech, Russian, Polish, German, Spanish, and Catalan. In addition to monolingual experiments, we combine the training data in multilingual experiments and train two joint models: one for the Slavic languages and one for all the languages together. We rely on an end-to-end deep learning model that we slightly adapted for the CorefUD corpus. Our results show that we can profit from harmonized annotations, and that using joint models helps significantly for the languages with smaller training data.
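As a rough illustration of the experimental setup described above, the following sketch shows one way the monolingual and joint training sets could be assembled. The language groups come from the abstract; the function names, experiment labels, and the representation of a corpus as a list of document strings are assumptions made for this example and are not taken from the paper's code.

```python
from typing import Dict, List

# Language groups as listed in the abstract.
SLAVIC = ["Czech", "Russian", "Polish"]
OTHER = ["German", "Spanish", "Catalan"]


def build_training_configs() -> Dict[str, List[str]]:
    """Map each (hypothetical) experiment name to the languages it trains on."""
    all_languages = SLAVIC + OTHER
    # One monolingual experiment per language.
    configs = {f"mono_{lang.lower()}": [lang] for lang in all_languages}
    # Two joint experiments: Slavic only, and all six languages together.
    configs["joint_slavic"] = list(SLAVIC)
    configs["joint_all"] = list(all_languages)
    return configs


def merge_training_data(corpora: Dict[str, List[str]],
                        languages: List[str]) -> List[str]:
    """Concatenate the per-language training documents for one experiment."""
    merged: List[str] = []
    for lang in languages:
        merged.extend(corpora[lang])
    return merged
```

Because the annotations are harmonized, concatenating the per-language training sets is enough in this sketch; corpora with different annotation schemes would first need a conversion step.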

[1] A. Sboev, et al. Deep Neural Networks Ensemble with Word Vector Representation Models to Resolve Coreference Resolution in Russian, 2020.

[2] Xiaoqiang Luo, et al. Scoring Coreference Partitions of Predicted Mentions: A Reference Implementation, 2014, ACL.

[3] Amir Globerson, et al. Coreference Resolution with Entity Equalization, 2019, ACL.

[4] Omer Levy, et al. SpanBERT: Improving Pre-training by Representing and Predicting Spans, 2019, TACL.

[5] Alexey Sorokin, et al. Tuning Multilingual Transformers for Language-Specific Named Entity Recognition, 2019, BSNLP@ACL.

[6] CorefUD 0.1: Coreference meets Universal Dependencies – a pilot experiment on harmonizing coreference datasets for 11 languages, 2021.

[7] Ankit Srivastava, et al. Different German and English Coreference Resolution Models for Multi-domain Content Curation Scenarios, 2017, GSCL.

[8] Omer Levy, et al. BERT for Coreference Resolution: Baselines and Analysis, 2019, EMNLP/IJCNLP.

[9] Luke S. Zettlemoyer, et al. End-to-end Neural Coreference Resolution, 2017, EMNLP.

[10] Liyan Xu, et al. Revealing the Myth of Higher-Order Inference in Coreference Resolution, 2020, EMNLP.

[11] Erik Cambria, et al. Anaphora and Coreference Resolution: A Review, 2018, Inf. Fusion.

[12] Henrique Lopes Cardoso, et al. Exploring Spanish Corpora for Portuguese Coreference Resolution, 2018, Fifth International Conference on Social Networks Analysis, Management and Security (SNAMS).

[13] Yannick Versley, et al. SemEval-2010 Task 1: Coreference Resolution in Multiple Languages, 2009, SemEval.

[14] Ming-Wei Chang, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.

[15] Maciej Ogrodniczuk, et al. Deep Neural Networks for Coreference Resolution for Polish, 2018, LREC.

[16] Gourab Kundu, et al. Neural Cross-Lingual Coreference Resolution And Its Application To Entity Linking, 2018, ACL.

[17] Ander Soraluze, et al. Deep Cross-Lingual Coreference Resolution for Less-Resourced Languages: The Case of Basque, 2019, Proceedings of the Second Workshop on Computational Models of Reference, Anaphora and Coreference.

[18] Michal Novák, et al. Coreference Resolution System Not Only for Czech, 2017, ITAT.