An Empirical Study of UMLS Concept Extraction from Clinical Notes using Boolean Combination Ensembles

Objective: To investigate the behavior of Boolean operators when combining annotation output from multiple Natural Language Processing (NLP) systems across multiple corpora, and to assess how filtering by aggregation of Unified Medical Language System (UMLS) Metathesaurus concepts affects system performance for Named Entity Recognition (NER) of UMLS concepts.

Materials and methods: Three corpora annotated for UMLS concepts were used: the 2010 i2b2 VA challenge set (31,161 annotations), the Multi-source Integrated Platform for Answering Clinical Questions (MiPACQ) corpus (17,457 annotations including UMLS concept unique identifiers), and the Fairview Health Services corpus (44,530 annotations). Our framework combines annotations generated by any number of NLP systems into an exhaustive set of ensembles using an approximate grid-search over combinations of Boolean operations. Performance of these Boolean combination ensembles on all available named entity annotations was compared with performance on annotation subsets filtered by UMLS semantic groups.

Results: We demonstrated how optimized Boolean combination ensembles were constructed using the Fairview corpus on the collection of UMLS concepts filtered by the group Procedures, and how our grid-search strategy identified 20 ensembles that outperformed all individual NLP systems for the group Chemicals & Drugs. We also showed that, for UMLS concept matching, Boolean ensembling on the MiPACQ corpus trended towards higher performance than individual systems.

Discussion: Boolean combination ensembles outperformed single systems in most cases. Use of an approximate grid-search can help optimize the precision-recall tradeoff and can provide a set of heuristics for choosing an optimal set of ensembles.

Conclusion: Ensembling can improve NER performance over individual systems. The framework we developed can be used to tailor the choice of Boolean combination ensembles to a diverse set of tasks. Our results indicate that NER and concept mapping remain challenging problems for clinical NLP.
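To make the idea of a Boolean combination ensemble concrete, the following is a minimal sketch and not the framework described above. It assumes annotations are exact character spans, treats AND as set intersection and OR as set union, and scores each candidate against a gold standard. The toy spans, the system names A, B, and C, and the helper names `prf` and `pairwise_ensembles` are all hypothetical, and only pairwise combinations are enumerated rather than an approximate grid-search over deeper expressions.

```python
from itertools import combinations

# Hypothetical toy data: each annotation is a (start, end) character span.
gold = {(0, 5), (10, 18), (25, 30), (40, 47)}
systems = {
    "A": {(0, 5), (10, 18), (50, 55)},
    "B": {(0, 5), (25, 30), (60, 66)},
    "C": {(10, 18), (25, 30), (40, 47), (70, 75)},
}


def prf(predicted, reference):
    """Return (precision, recall, F1) under exact span matching (an assumption)."""
    tp = len(predicted & reference)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(reference) if reference else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def pairwise_ensembles(named_systems):
    """Build every two-system AND (intersection) and OR (union) ensemble."""
    ensembles = {}
    for a, b in combinations(sorted(named_systems), 2):
        ensembles[f"({a} AND {b})"] = named_systems[a] & named_systems[b]
        ensembles[f"({a} OR {b})"] = named_systems[a] | named_systems[b]
    return ensembles


# Score the individual systems alongside the pairwise ensembles.
candidates = {**systems, **pairwise_ensembles(systems)}
scores = {name: prf(spans, gold) for name, spans in candidates.items()}

# Rank candidates by F1, highest first.
for name, (p, r, f1) in sorted(scores.items(), key=lambda kv: -kv[1][2]):
    print(f"{name:<10} P={p:.2f} R={r:.2f} F1={f1:.2f}")
```

In this toy setting, AND ensembles favor precision and OR ensembles favor recall, which is the tradeoff a search over combinations would have to navigate. Overlap-based matching or matching on concept identifiers would need a different comparison than plain set operations on spans.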
