Ever-expanding volumes of biomedical text require automated semantic annotation techniques if they are to be curated and put to best use. An established field of research seeks to link mentions in text to knowledge bases, such as those included in the UMLS (Unified Medical Language System), in order to enable a more sophisticated understanding of the text. This work has yielded good results for tasks such as curating literature, but annotation systems are increasingly being applied more broadly. Medical vocabularies are expanding in size, and with them the extent of term ambiguity. Document collections are increasing in size and complexity, creating a greater need for speed and robustness. Furthermore, as the technologies are turned to new tasks, requirements change; for example, greater coverage of expressions may be required in order to annotate patient records, and greater accuracy may be needed for applications that affect patients. These changes place new demands on the approaches currently in use. In this work, we present a new system, Bio-YODIE, and compare it to two other popular systems in order to give guidance about which approaches suit different scenarios and how systems might be designed to accommodate future needs.