Event and Entity Coreference Across Five Languages: Effects of Context and Referring Expression

Current work on coreference focuses primarily on entities, often leaving unanalysed the use of anaphors to corefer with antecedents such as events and textual segments. Moreover, the anaphoric forms that speakers use for entity and event coreference are not mutually exclusive. This ambiguity has been the subject of work in English, with evidence of a split between comprehenders’ preferential interpretation of personal versus demonstrative pronouns. In addition, comprehenders are shown to be sensitive to antecedent complexity and aspectual status, two verb-driven cues that signal how an event is being portrayed. Here we extend this work via a comparison across five languages (English, French, German, Italian, and Spanish). With a story-continuation experiment, we test how different referring expressions corefer with entity and event antecedents and whether verbal features such as argument structure and aspect influence this choice. Our results show biases that are broadly consistent across languages but not categorical: entity coreference is favoured for personal pronouns and event coreference for demonstratives. Antecedent complexity increases the rate at which anaphors are taken to corefer with an event antecedent, as does portraying an event as completed, though the latter effect does not reach significance. Lastly, we report a comparison of how the same referring expressions are used to refer to entity and event antecedents in a trilingual parallel corpus annotated with coreference. Together, the results provide a first crosslingual picture of coreference preferences beyond the restricted entity-only patterns targeted by most existing work on coreference. The five languages are all shown to allow gradable use of pronouns for entity and event coreference, with biases that align with existing generalizations about the link between prominence and the use of reduced referring expressions.
The studies also show the feasibility of manipulating targeted verb-driven cues across multiple languages to support crosslingual comparisons.
