SECNLP: A Survey of Embeddings in Clinical Natural Language Processing

Distributed vector representations, or embeddings, map variable-length text to dense fixed-length vectors and capture prior knowledge that can be transferred to downstream tasks. Although embeddings have become the de facto standard for text representation in deep learning based NLP tasks in both the general and clinical domains, no survey paper presents a detailed review of embeddings in clinical natural language processing. In this survey, we discuss various medical corpora and their characteristics as well as medical codes, and we present a brief overview and comparison of popular embedding models. We classify clinical embeddings and discuss each embedding type in detail. We then discuss evaluation methods, followed by possible solutions to the challenges in clinical embeddings. Finally, we conclude with future directions that will advance research in clinical embeddings.
