Adapting Coreference Resolution for Processing Violent Death Narratives

Coreference resolution is an important component in analyzing narrative text from administrative data (e.g., clinical or police sources). However, existing coreference models trained on general language corpora suffer from poor transferability due to domain gaps, especially when they are applied to gender-inclusive data with lesbian, gay, bisexual, and transgender (LGBT) individuals. In this paper, we analyzed the challenges of coreference resolution in an exemplary form of administrative text written in English: violent death narratives from the USA’s Centers for Disease Control’s (CDC) National Violent Death Reporting System. We developed a set of data augmentation rules to improve model performance using a probabilistic data programming framework. Experiments on narratives from an administrative database, as well as existing gender-inclusive coreference datasets, demonstrate the effectiveness of data augmentation in training coreference models that can better handle text data about LGBT individuals.
