Embedding Electronic Health Records for Clinical Information Retrieval

Neural network representation learning frameworks have recently been shown to be highly effective across a wide range of tasks, from radiography interpretation and data-driven diagnostics to clinical decision support. This often superior performance comes at the price of dramatically increased training data requirements that cannot be satisfied in every institution or scenario. To counter such data sparsity, distant supervision alleviates the need for scarce in-domain data by training on a related, resource-rich task instead. This study presents an end-to-end neural clinical decision support system that recommends relevant literature for individual patients (a task with few available resources) via distant supervision on the well-known MIMIC-III collection (an abundant resource). Our experiments show significant improvements in retrieval effectiveness over traditional statistical retrieval models as well as over purely locally supervised ones.
