Paper Information - Risk Event and Probability Extraction for Modeling Medical Risks

Risk Event and Probability Extraction for Modeling Medical Risks

In this paper, we address the task of extracting risk events and probabilities from free text, focusing in particular on the biomedical domain. While our initial motivation is to enable the determination of the parameters of a Bayesian belief network, our approach is not specific to that use case. We are the first to investigate this task as a sequence tagging problem in which we label spans of text as events A or B, which are then used to construct probability statements of the form P(A|B)=x. We show that our approach significantly outperforms an entity extraction baseline on a new annotated medical risk event corpus. We also explore semi-supervised methods that lead to a modest improvement, encouraging further work in this direction.
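The step from tagged spans to a statement of the form P(A|B)=x can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the BIO tag encoding, the extra `P` label for the probability span, the helper names `bio_to_spans` and `build_statement`, and the example sentence with its hand-assigned tags. A real system would obtain the tags from a trained sequence tagger.

```python
from typing import List, Optional, Tuple


def bio_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group contiguous B-X / I-X tags into (label, text) spans."""
    spans: List[Tuple[str, str]] = []
    label: Optional[str] = None
    words: List[str] = []
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new span starts; close the previous one if any.
            if label is not None:
                spans.append((label, " ".join(words)))
            label, words = tag[2:], [token]
        elif tag.startswith("I-") and label == tag[2:]:
            # Continuation of the current span.
            words.append(token)
        else:
            # "O" tag or an inconsistent I- tag: close any open span.
            if label is not None:
                spans.append((label, " ".join(words)))
            label, words = None, []
    if label is not None:
        spans.append((label, " ".join(words)))
    return spans


def build_statement(spans: List[Tuple[str, str]]) -> Optional[str]:
    """Combine one A span, one B span and one P span into 'P(A | B) = x'."""
    by_label = dict(spans)
    if not {"A", "B", "P"} <= by_label.keys():
        return None  # not enough information for a conditional statement
    raw = by_label["P"]
    # Hypothetical normalisation: accept "15%" or "0.15".
    value = float(raw.rstrip("%")) / 100 if raw.endswith("%") else float(raw)
    return f"P({by_label['A']} | {by_label['B']}) = {value:g}"


# Invented example sentence with hand-assigned tags (not from the corpus).
tokens = ["Smokers", "have", "a", "15%", "risk", "of", "lung", "cancer"]
tags = ["B-B", "O", "O", "B-P", "O", "O", "B-A", "I-A"]

spans = bio_to_spans(tokens, tags)
statement = build_statement(spans)
print(statement)
```

In this sketch the conditioning event B and the outcome event A are simply the first span found for each label; sentences with several events per label would need an explicit pairing step, which is left out here.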

Bogdan Sacaleanu | Charles Jochim | Léa A. Deleris
