Evaluation of Coreference Resolution Tools for Polish from the Information Extraction Perspective

In this paper we discuss the performance of existing coreference resolution tools for Polish from the perspective of information extraction tasks. We take into account the source of mentions, i.e., gold-standard mentions vs. automatically recognized ones. We evaluate three existing tools, namely IKAR, Ruler and Bartek, on the KPWr corpus. We show that the widely used coreference evaluation metrics (B³, MUC, CEAF, BLANC) do not reflect actual performance on the task of recognizing semantic relations between named entities. We therefore propose a supplementary metric called PARENT, which measures the correctness of links between referential mentions and named entities.
