The Taming of Reconcile as a Biomedical Coreference Resolver

To participate in the Protein Coreference task of the BioNLP 2011 Shared Task, we adapted Reconcile, a coreference resolution engine, by replacing some of its pre-processing components and adding a new mention detector. Training two separate classifiers, one for detecting anaphor mentions and one for detecting antecedent mentions, yielded some improvement. Our system achieved the highest score in the task: an F-score of 34.05% in partial mention, protein links, and system recall mode. We found that specialized mention detection is crucial for coreference resolution in the biomedical domain.
