Certainty Identification in Texts: Categorization Model and Manual Tagging Results

This chapter presents a theoretical framework and preliminary results for the manual categorization of explicit certainty information in 32 English newspaper articles. Our contribution is a proposed categorization model and analytical framework for certainty identification. Certainty is treated as a type of subjective information available in texts. Statements with explicit certainty markers were identified and categorized along four hypothesized dimensions: level, perspective, focus, and time of certainty. The preliminary results indicate that certainty information is present in these texts and can be identified manually within the proposed four-dimensional framework. The editorial sample had a significantly higher frequency of markers per sentence than the news story sample. For editorials, the most populated categories were high level of certainty, writer’s point of view, and future and present time. For news stories, the most common categories were high and moderate levels of certainty, directly involved third party’s point of view, and past time. These patterns have encouraging practical implications for automating certainty identification.
