Deep Cross-Lingual Coreference Resolution for Less-Resourced Languages: The Case of Basque

In this paper, we present a cross-lingual neural coreference resolution system for Basque, a less-resourced language. First, we build the first neural coreference resolution system for Basque, training it on the relatively small EPEC-KORREF corpus (45,000 words). Next, we design a cross-lingual coreference resolution system that learns from a larger English corpus and uses cross-lingual embeddings to resolve coreference in Basque. The cross-lingual system obtains slightly better results (40.93 CoNLL F1) than the monolingual system (39.12 CoNLL F1), even though no Basque corpus is used to train it.
