Improving Clinical Diagnosis Inference through Integration of Structured and Unstructured Knowledge

This paper presents a novel approach to automatically inferring the most probable diagnosis from a given clinical narrative. Structured Knowledge Bases (KBs) are useful for such complex tasks but are not sufficient on their own. Hence, we leverage a vast amount of unstructured free text and integrate it with structured KBs. The key ideas are building a concept graph from both structured and unstructured knowledge sources, and ranking the diagnosis concepts using enhanced word embedding vectors learned from the integrated sources. Experiments on the TREC CDS and HumanDx datasets show that our methods improve the results of clinical diagnosis inference.
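As a rough illustration of the ranking idea only, the sketch below scores candidate diagnosis concepts by cosine similarity between their embedding and the mean embedding of the concepts found in a narrative. It is not the method of this paper: the function names, the averaging step, and the three-dimensional toy vectors are all assumptions made for the example, and the concept graph construction is not shown.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(component) / n for component in zip(*vectors)]


def rank_diagnoses(narrative_concepts, candidates, embeddings):
    """Rank candidate diagnoses by similarity to the narrative's mean concept vector.

    Hypothetical helper: concepts missing from `embeddings` are skipped,
    and an empty list is returned if no narrative concept has a vector.
    """
    known = [embeddings[c] for c in narrative_concepts if c in embeddings]
    if not known:
        return []
    query = mean_vector(known)
    scored = [(d, cosine(query, embeddings[d])) for d in candidates if d in embeddings]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy embeddings invented for this example; real vectors would be learned
# from the integrated structured and unstructured sources.
toy_embeddings = {
    "fever": [0.9, 0.1, 0.0],
    "cough": [0.8, 0.2, 0.1],
    "pneumonia": [0.85, 0.15, 0.05],
    "fracture": [0.0, 0.1, 0.9],
}

ranking = rank_diagnoses(["fever", "cough"], ["pneumonia", "fracture"], toy_embeddings)
for diagnosis, score in ranking:
    print(f"{diagnosis}: {score:.3f}")
```

With these toy vectors, "pneumonia" ranks above "fracture" because its vector lies closest to the mean of the narrative concepts.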
