Learning an expandable EMR-based medical knowledge network to enhance clinical diagnosis

Electronic medical records (EMRs) contain a wealth of knowledge that can assist doctors in making clinical decisions such as disease diagnosis. Constructing a medical knowledge network (MKN) that links the medical concepts in EMRs is an effective way to manage this knowledge. The quality of a diagnosis made by an MKN-based clinical decision support system depends on the accuracy of the medical knowledge and the completeness of the network. However, collecting knowledge is a long and cumulative process, so it is hard to construct a complete MKN from limited data. The objective of this study was to develop an expandable EMR-based MKN that improves initial clinical diagnosis. A network of symptom-indicate-disease knowledge from 992 Chinese EMRs (CEMRs) was manually constructed as Original-MKN, and an incremental expansion framework was applied to it to obtain an MKN that can be expanded with new CEMRs. The framework consists of two steps: (1) integrating external knowledge extracted from medical information websites and (2) mining potential knowledge from new EMRs. The framework also adopts a diagnosis-driven learning method to estimate the effectiveness of each piece of knowledge in clinical practice. Experimental results indicate that the expanded MKN achieves a precision of 0.837 at a recall of 0.719 in clinical diagnosis, outperforming Original-MKN and four classical machine learning methods. Furthermore, both external medical knowledge and potential medical knowledge benefit MKN expansion and disease diagnosis. The proposed incremental expansion framework enables the MKN to keep learning new knowledge.
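The idea of an expandable network of weighted symptom-indicate-disease links can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class `ExpandableMKN`, its methods, the summed-weight scoring rule, and all symptom names, disease names, and weights are hypothetical and chosen only for illustration. The abstract does not specify how knowledge is weighted or how candidate diseases are ranked.

```python
from collections import defaultdict


class ExpandableMKN:
    """Toy symptom-indicate-disease network (hypothetical, for illustration only)."""

    def __init__(self):
        # weights[symptom][disease] -> weight of the "indicates" link
        self.weights = defaultdict(dict)

    def add_knowledge(self, symptom, disease, weight=1.0):
        """Add a link if it is not already present; return True if it was new."""
        if disease in self.weights[symptom]:
            return False  # existing knowledge is kept unchanged
        self.weights[symptom][disease] = weight
        return True

    def expand(self, triples):
        """Merge (symptom, disease, weight) triples; return how many were new."""
        return sum(self.add_knowledge(s, d, w) for s, d, w in triples)

    def diagnose(self, symptoms):
        """Rank diseases by the summed weight of links from the observed symptoms."""
        scores = defaultdict(float)
        for s in symptoms:
            for disease, w in self.weights.get(s, {}).items():
                scores[disease] += w
        # Highest score first; ties broken alphabetically for determinism.
        return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Initial network, standing in for a manually built one (made-up values).
mkn = ExpandableMKN()
mkn.expand([
    ("fever", "influenza", 0.6),
    ("cough", "influenza", 0.5),
    ("cough", "bronchitis", 0.7),
])

# Expansion step: one link is new, one duplicates existing knowledge.
added = mkn.expand([
    ("wheezing", "bronchitis", 0.8),
    ("cough", "influenza", 0.9),
])

ranking = mkn.diagnose(["cough", "wheezing"])
```

In this toy version, expansion only adds links that are missing, so repeated merges from external sources or new records never overwrite existing weights. A learning step that re-estimates the weights, as the diagnosis-driven method in the abstract suggests, would be a separate component.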
