CBN: Constructing a clinical Bayesian network based on data from the electronic medical record

Learning candidate causal relationships between diseases and symptoms from electronic medical records (EMRs) is the first step towards learning models that perform diagnostic inference directly from real healthcare data. However, existing diagnostic inference systems rely on knowledge bases, such as ontologies, that are either manually compiled through a labour-intensive process or automatically derived using simple pairwise statistics. We explore CBN, a Clinical Bayesian Network construction method for probabilistic inference over medical ontologies, to learn a high-quality Bayesian topology and complete the ontology directly from EMRs. Specifically, we first extract medical entity relationships from over 10,000 deidentified patient records and use odds ratio (OR) calculation together with the K2 greedy algorithm to construct a Bayesian topology automatically. We then use Bayesian estimation to learn the probability distributions. Finally, we employ the Bayesian network to complete the causal relationships and probability distributions of the ontology, enhancing its inference capability. By evaluating the learned topology against physicians' expert opinions and entropy calculations, and by measuring ontology-based diagnosis classification, our study demonstrates that the direct and automated construction of a high-quality health topology and ontology from medical records is feasible. Our results are reproducible, and we will release the source code and the CN-Stroke knowledge graph of this work after publication.
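
To make two of the quantities named above concrete, the following is a minimal Python sketch of an odds ratio for a disease-symptom pair and a Bayesian (posterior-mean) estimate of P(symptom | disease). It is an illustration under stated assumptions, not the authors' implementation: the toy records, the function names, the 0.5 continuity correction for empty cells, and the symmetric Beta prior with `alpha=1.0` are all hypothetical choices that the abstract does not specify.

```python
# Hypothetical toy data: each record is the set of entities found in one EMR.
records = [
    {"stroke", "hemiplegia"},
    {"stroke", "hemiplegia", "headache"},
    {"stroke"},
    {"migraine", "headache"},
    {"migraine", "headache"},
    {"migraine"},
]

def contingency(records, disease, symptom):
    """Return the 2x2 counts (a, b, c, d):
    a = disease and symptom, b = disease only,
    c = symptom only,        d = neither."""
    a = b = c = d = 0
    for entities in records:
        has_disease = disease in entities
        has_symptom = symptom in entities
        if has_disease and has_symptom:
            a += 1
        elif has_disease:
            b += 1
        elif has_symptom:
            c += 1
        else:
            d += 1
    return a, b, c, d

def odds_ratio(a, b, c, d, correction=0.5):
    """Odds ratio (a*d)/(b*c). Adding `correction` to every cell avoids
    division by zero (an assumption, not taken from the paper)."""
    a, b, c, d = (x + correction for x in (a, b, c, d))
    return (a * d) / (b * c)

def bayes_estimate(a, b, alpha=1.0):
    """Posterior mean of P(symptom | disease) under a symmetric
    Beta(alpha, alpha) prior (assumed prior)."""
    return (a + alpha) / (a + b + 2 * alpha)

a, b, c, d = contingency(records, "stroke", "hemiplegia")
print("OR:", odds_ratio(a, b, c, d))
print("P(hemiplegia | stroke):", bayes_estimate(a, b))
```

A large odds ratio flags a disease-symptom pair as a candidate edge; how CBN combines such scores with the K2 search is not described in the abstract, so that step is left out of the sketch.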
