Are You Sure That This Happened? Assessing the Factuality Degree of Events in Text

Identifying the veracity, or factuality, of event mentions in text is fundamental for reasoning about eventualities in discourse. Inferences derived from events judged as not having happened, or as being only possible, differ from those derived from events evaluated as factual. Event factuality involves two separate levels of information. On the one hand, it deals with polarity, which distinguishes between positive and negative instantiations of events. On the other, it has to do with degrees of certainty (e.g., possible, probable), an information level generally subsumed under the category of epistemic modality. This article aims to contribute to a better understanding of how event factuality is articulated in natural language. To that end, we put forward a linguistically oriented computational model which has at its core an algorithm articulating the effect of factuality relations across levels of syntactic embedding. As a proof of concept, this model has been implemented in De Facto, a factuality profiler for eventualities mentioned in text, and tested against a corpus built specifically for the task, yielding an F1 of 0.70 (macro-averaging) and 0.80 (micro-averaging). Each of these two measures compensates for the over-emphasis of the other (macro-averaging on the less populated categories, micro-averaging on the more populated ones), so they can be interpreted as the lower and upper bounds of De Facto's performance.
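To make the difference between the two averages concrete, the following is a minimal sketch of how macro- and micro-averaged F1 can be computed for a single-label classification task. It is not taken from De Facto or its evaluation setup: the function names, the category labels ("certain", "possible"), and the toy data are all hypothetical and serve only to show why macro-averaging tends to be the lower figure when a sparsely populated category is handled poorly.

```python
from collections import Counter


def f1(tp, fp, fn):
    """F1 from raw counts; defined as 0.0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_micro_f1(gold, pred):
    """Return (macro_f1, micro_f1) for two equal-length label sequences."""
    labels = sorted(set(gold) | set(pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted label p was wrong
            fn[g] += 1  # gold label g was missed
    # Macro: average the per-category F1, so every category weighs the same.
    macro = sum(f1(tp[l], fp[l], fn[l]) for l in labels) / len(labels)
    # Micro: pool the counts first, so populous categories dominate.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    return macro, micro


# Hypothetical toy data: 8 "certain" events and 2 "possible" events,
# with one "possible" event mislabelled as "certain".
gold = ["certain"] * 8 + ["possible"] * 2
pred = ["certain"] * 8 + ["certain", "possible"]
macro, micro = macro_micro_f1(gold, pred)
```

On this toy data the macro-averaged F1 is about 0.80 and the micro-averaged F1 is 0.90: the single error falls in the small category, which pulls down the macro figure far more than the micro one.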
