Learning the Scope of Hedge Cues in Biomedical Texts

Identifying hedged information in biomedical literature is an important subtask in information extraction because it would be misleading to extract speculative information as factual information. In this paper we present a machine learning system that finds the scope of hedge cues in biomedical texts. The system is based on a similar system that finds the scope of negation cues. We show that the same scope finding approach can be applied to both negation and hedging. To investigate the robustness of the approach, the system is tested on the three subcorpora of the BioScope corpus that represent different text types.

[1]  Michael Krauthammer,et al.  Of truth and pathways: chasing bits of information through myriads of articles , 2002, ISMB.

[2]  Sophia Ananiadou,et al.  Developing a Robust Part-of-Speech Tagger for Biomedical Text , 2005, Panhellenic Conference on Informatics.

[3]  Andrew McCallum,et al.  Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data , 2001, ICML.

[4]  Masaki Murata,et al.  Extracting Protein-Protein Interaction Information from Biomedical Text with SVM , 2006, IEICE Trans. Inf. Syst..

[5]  Walter Daelemans,et al.  TiMBL: Tilburg Memory-Based Learner, version 2.0, Reference guide , 1998 .

[6]  Roser Morante,et al.  A Metalearning Approach to Processing the Scope of Negation , 2009, CoNLL.

[7]  Ted Briscoe,et al.  Weakly Supervised Learning for Hedge Classification in Scientific Literature , 2007, ACL.

[8]  Jun'ichi Tsujii,et al.  Bidirectional Inference with the Easiest-First Strategy for Tagging Sequence Data , 2005, HLT.

[9]  György Szarvas,et al.  Hedge Classification in Biomedical Texts with a Weakly Supervised Selection of Keywords , 2008, ACL.

[10]  Robert E. Mercer,et al.  Hedging in Scientific Articles as a Means of Classifying Citations , 2004 .

[11]  Panayiotis Bozanis,et al.  Advances in Informatics: 10th Panhellenic Conference on Informatics, PCI 2005, Volas, Greece, November 11-13, 2005, Proceedings (Lecture Notes in Computer Science) , 2005 .

[12]  K. Hyland,et al.  Hedging in scientific research articles , 1998 .

[13]  A. Valencia,et al.  Linking genes to literature: text mining, information extraction, and retrieval applications for biology , 2008, Genome Biology.

[14]  A. Valencia,et al.  Overview of the protein-protein interaction annotation extraction task of BioCreative II , 2008, Genome Biology.

[15]  Ben Medlock,et al.  Exploring hedge identification in biomedical literature , 2008, J. Biomed. Informatics.

[16]  Peter E. Hart,et al.  Nearest neighbor pattern classification , 1967, IEEE Trans. Inf. Theory.

[18]  George Lakoff,et al.  Hedges: A study in meaning criteria and the logic of fuzzy concepts , 1973, J. Philos. Log..

[19]  Sabine Buchholz,et al.  CoNLL-X Shared Task on Multilingual Dependency Parsing , 2006, CoNLL.

[20]  Alexander Dekhtyar,et al.  Information Retrieval , 2018, Lecture Notes in Computer Science.

[21]  Padmini Srinivasan,et al.  The Language of Bioscience: Facts, Speculations, and Statements In Between , 2004, HLT-NAACL 2004.

[22]  János Csirik,et al.  The BioScope corpus: biomedical texts annotated for uncertainty, negation and their scopes , 2008, BMC Bioinformatics.

[23]  Carol Friedman,et al.  Research Paper: A General Natural-language Text Processor for Clinical Radiology , 1994, J. Am. Medical Informatics Assoc..

[24]  K. Hengeveld Mood and modality , 2004 .

[25]  Nigel Collier,et al.  The GENIA project: corpus-based knowledge acquisition and information extraction from genome research papers , 1999, EACL.

[26]  Halil Kilicoglu,et al.  Recognizing speculative language in biomedical research articles: a linguistically motivated perspective , 2008, BMC Bioinformatics.

[27]  János Csirik,et al.  The BioScope corpus: annotation for negation, uncertainty and their scope in biomedical texts , 2008, BioNLP.

[28]  Sophia Ananiadou,et al.  Categorising Modality in Biomedical Texts , 2008, LREC 2008.

[29]  James Pustejovsky,et al.  Annotating and Recognizing Event Modality in Text , 2006, FLAIRS.