Enhancing ontology-driven diagnostic reasoning with a symptom-dependency-aware Naïve Bayes classifier

Background: Ontologies have attracted substantial attention from both academia and industry, and reasoning under uncertainty is an important problem in ontology research. For example, when a patient is suffering from cirrhosis, abdominal vein varices are four times more likely to appear than a bitter taste. Such medical knowledge is crucial for decision-making in various medical applications but is missing from existing medical ontologies. In this paper, we aim to discover medical knowledge probabilities from electronic medical record (EMR) texts to enrich ontologies. First, we build an ontology by identifying meaningful entity mentions from EMRs. Then, we propose a symptom-dependency-aware naïve Bayes classifier (SDNB) that is based on the assumption that there is a level of dependency among symptoms. To ensure the accuracy of the diagnostic classification, we incorporate the probability of a disease into the ontology via innovative approaches.

Results: We conduct a series of experiments to evaluate whether the proposed method can discover meaningful and accurate probabilities for medical knowledge. Based on over 30,000 deidentified medical records, we explore 336 abdominal diseases and 81 related symptoms. Among these 336 gastrointestinal diseases, the probabilities of 31 diseases are obtained via our method. These 31 disease probabilities and 189 conditional probabilities between diseases and symptoms are added to the generated ontology.

Conclusion: In this paper, we propose a medical knowledge probability discovery method that is based on the analysis and extraction of EMR text data for enriching a medical ontology with probability information. The experimental results demonstrate that the proposed method can effectively identify accurate medical knowledge probability information from EMR data. In addition, the proposed method can efficiently and accurately calculate the probability of a patient suffering from a specified disease, thereby demonstrating the advantage of combining an ontology and a symptom-dependency-aware naïve Bayes classifier.
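
To make the role of the two kinds of probabilities concrete, the sketch below shows how disease probabilities and disease-symptom conditional probabilities could be combined to score a patient. It implements the standard naïve Bayes rule, which treats symptoms as independent given the disease; it is not the SDNB itself, whose dependency handling is not specified in this abstract. All numbers, the second disease, and the names `priors`, `likelihoods`, and `posterior` are hypothetical. The two cirrhosis likelihoods were chosen only to mirror the four-to-one ratio mentioned above.

```python
from math import prod

# Hypothetical values for illustration only; none are taken from the paper.
# priors[d] stands in for P(d), a disease probability stored in the ontology.
priors = {"cirrhosis": 0.02, "gastritis": 0.10}

# likelihoods[d][s] stands in for P(s | d), a disease-symptom conditional probability.
likelihoods = {
    "cirrhosis": {"abdominal vein varices": 0.40, "bitter taste": 0.10},
    "gastritis": {"abdominal vein varices": 0.01, "bitter taste": 0.20},
}


def posterior(observed, priors, likelihoods, floor=1e-6):
    """Standard naive Bayes: P(d | s_1..s_n) is proportional to P(d) * prod_i P(s_i | d).

    Symptoms missing from a disease's table get a small floor probability
    (an assumption of this sketch) so that the product does not collapse to zero.
    """
    scores = {
        disease: prior * prod(likelihoods[disease].get(s, floor) for s in observed)
        for disease, prior in priors.items()
    }
    total = sum(scores.values())
    # Normalise over the candidate diseases so the scores sum to one.
    return {disease: score / total for disease, score in scores.items()}


result = posterior(["abdominal vein varices"], priors, likelihoods)
print(result)
```

With these made-up inputs, observing abdominal vein varices moves most of the probability mass to cirrhosis even though its prior is lower. A dependency-aware variant would replace the plain product with terms that account for correlated symptoms.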
