Joint Inference for Knowledge Extraction from Biomedical Literature

Knowledge extraction from online repositories such as PubMed holds the promise of dramatically speeding up biomedical research and drug design. After initially focusing on recognizing proteins and binary interactions, the community has recently shifted their attention to the more ambitious task of recognizing complex, nested event structures. State-of-the-art systems use a pipeline architecture in which the candidate events are identified first, and subsequently the arguments. This fails to leverage joint inference among events and arguments for mutual disambiguation. Some joint approaches have been proposed, but they still lag much behind in accuracy. In this paper, we present the first joint approach for bio-event extraction that obtains state-of-the-art results. Our system is based on Markov logic and adopts a novel formulation by jointly predicting events and arguments, as well as individual dependency edges that compose the argument paths. On the BioNLP'09 Shared Task dataset, it reduced F1 errors by more than 10% compared to the previous best joint approach.

[1]  Ben Taskar,et al.  Introduction to statistical relational learning , 2007 .

[2]  Jari Björne,et al.  Extracting Complex Biological Events with Rich Graph-Based Feature Sets , 2009, BioNLP@HLT-NAACL.

[3]  Thomas Hofmann,et al.  Predicting Structured Data (Neural Information Processing) , 2007 .

[4]  Andrew McCallum,et al.  Introduction to Statistical Relational Learning , 2007 .

[5]  Pedro M. Domingos,et al.  Efficient Weight Learning for Markov Logic Networks , 2007, PKDD.

[6]  Sampo Pyysalo,et al.  Overview of BioNLP’09 Shared Task on Event Extraction , 2009, BioNLP@HLT-NAACL.

[7]  Jun'ichi Tsujii,et al.  From Protein-Protein Interaction to Molecular Event Extraction , 2009, BioNLP@HLT-NAACL.

[8]  Christopher D. Manning,et al.  Generating Typed Dependency Parses from Phrase Structure Parses , 2006, LREC.

[9]  Gökhan BakIr,et al.  Predicting Structured Data , 2008 .

[10]  Pedro M. Domingos,et al.  Markov Logic: An Interface Layer for Artificial Intelligence , 2009, Markov Logic: An Interface Layer for Artificial Intelligence.

[11]  Tom M. Mitchell,et al.  Learning to construct knowledge bases from the World Wide Web , 2000, Artif. Intell..

[12]  Richard Johansson,et al.  The CoNLL-2009 Shared Task: Syntactic and Semantic Dependencies in Multiple Languages , 2009, CoNLL Shared Task.

[13]  Pedro M. Domingos,et al.  Joint Inference in Information Extraction , 2007, AAAI.

[14]  Hoifung Poon,et al.  Unsupervised Morphological Segmentation with Log-Linear Models , 2009, NAACL.

[15]  Pedro M. Domingos,et al.  Sound and Efficient Inference with Probabilistic and Deterministic Dependencies , 2006, AAAI.

[16]  Jun'ichi Tsujii,et al.  A Markov Logic Approach to Bio-Molecular Event Extraction , 2009, BioNLP@HLT-NAACL.

[17]  Matthew Richardson,et al.  The Alchemy System for Statistical Relational AI: User Manual , 2007 .

[18]  Ben Taskar,et al.  Introduction to Statistical Relational Learning (Adaptive Computation and Machine Learning) , 2007 .