Classifying Semantic Relations in Bioscience Texts

A crucial step toward the goal of automatic extraction of propositional information from natural language text is the identification of semantic relations between constituents in sentences. We examine the problem of distinguishing among seven relation types that can occur between the entities "treatment" and "disease" in bioscience text, and the problem of identifying such entities. We compare five generative graphical models and a neural network, using lexical, syntactic, and semantic features, finding that the latter help achieve high classification accuracy.

[1]  Barry Smith,et al.  Proceedings of the AMIA Symposium , 2005 .

[2]  Eric Brill,et al.  Transformation-Based Error-Driven Learning and Natural Language Processing: A Case Study in Part-of-Speech Tagging , 1995, CL.

[3]  Ronen Feldman,et al.  Mining biomedical literature using information extraction , .

[4]  Mark Craven,et al.  Representing Sentence Structure in Hidden Markov Models for Information Extraction , 2001, IJCAI.

[5]  Andrew McCallum,et al.  Information Extraction with HMM Structures Learned by Stochastic Optimization , 2000, AAAI/IAAI.

[6]  Michael Krauthammer,et al.  GENIES: a natural-language processing system for the extraction of molecular pathways from journal articles , 2001, ISMB.

[7]  Dmitry Zelenko,et al.  Kernel Methods for Relation Extraction , 2002, J. Mach. Learn. Res..

[8]  Luis Gravano,et al.  Snowball: extracting relations from large plain-text collections , 2000, DL '00.

[9]  Lawrence Hunter,et al.  Mining molecular binding terminology from biomedical text , 1999, AMIA.

[10]  Mark Craven,et al.  Learning to Extract Relations from MEDLINE , 1999 .

[11]  Padmini Srinivasan,et al.  Exploring text mining from MEDLINE , 2002, AMIA.

[12]  Michael I. Jordan,et al.  On Discriminative vs. Generative Classifiers: A comparison of logistic regression and naive Bayes , 2001, NIPS.

[13]  Mirella Lapata,et al.  Proceedings of EMNLP 2004 , 2004 .

[14]  Richard M. Schwartz,et al.  An Algorithm that Learns What's in a Name , 1999, Machine Learning.

[15]  James Pustejovsky,et al.  Robust Relational Parsing Over Biomedical Literature: Extracting Inhibit Relations , 2001, Pacific Symposium on Biocomputing.

[16]  Lawrence Hunter,et al.  Extracting Molecular Binding Relationships from Biomedical Text , 2000, ANLP.

[17]  Michael Collins,et al.  A New Statistical Parser Based on Bigram Lexical Dependencies , 1996, ACL.

[18]  Barbara Rosario,et al.  The Descent of Hierarchy, and Selection in Relational Semantics , 2002, ACL.

[19]  Roger Levy,et al.  A Generative Model for Semantic Role Labeling , 2003, ECML.