Background: A patient's history, family history, diagnoses, medications, and other information concerning the patient's health and possible future treatment are usually recorded in free-form narrative reports. Extracting the relevant information makes it easier to pass that information to caretakers who speak other languages, to apply modern techniques such as reminding caretakers about conflicts with medical guidelines, and to collect data for scientific use.
Objectives: The aim of this paper is to summarize the field of information extraction from free-form texts and to present the results the author has achieved using simple methods for information extraction.
Methods: Lexical analysis and the available Czech versions of medical codebooks were used in the first experiment.
Results: We show that narrative medical reports differ so much in form from general texts that they cannot be treated as general texts. Additionally, the available Czech codebooks were found insufficient for direct use as dictionaries for term recognition.
Conclusions: New dictionaries of Czech medical terms need to be developed. Symbolic techniques have been found effective for recognizing pattern-specific values such as the Czech birth number or systolic/diastolic blood pressure values.
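To illustrate what recognizing pattern-specific values can look like, here is a minimal sketch that uses regular expressions as one possible symbolic technique. It is not the author's implementation. The function names, the exact patterns, and the sample text are assumptions made for illustration. The birth-number check assumes the commonly described rule that a ten-digit number is divisible by 11, and the blood-pressure filter assumes only that the systolic value exceeds the diastolic one.

```python
import re

# Assumed pattern: six digits (YYMMDD), an optional slash, then three or four digits.
BIRTH_NUMBER = re.compile(r"\b(\d{6})/?(\d{3,4})\b")

# Assumed pattern: systolic/diastolic, each two or three digits, not embedded
# in a longer run of digits or slashes (so birth numbers are not matched).
BLOOD_PRESSURE = re.compile(r"(?<![\d/])(\d{2,3})\s*/\s*(\d{2,3})(?![\d/])")


def find_birth_numbers(text):
    """Return birth-number candidates found in text.

    Ten-digit candidates are kept only if divisible by 11 (assumed rule);
    nine-digit candidates are kept without a checksum test.
    """
    found = []
    for match in BIRTH_NUMBER.finditer(text):
        digits = match.group(1) + match.group(2)
        if len(digits) == 10 and int(digits) % 11 != 0:
            continue
        found.append(match.group(0))
    return found


def find_blood_pressures(text):
    """Return (systolic, diastolic) pairs where systolic exceeds diastolic."""
    found = []
    for match in BLOOD_PRESSURE.finditer(text):
        systolic, diastolic = int(match.group(1)), int(match.group(2))
        if systolic > diastolic:
            found.append((systolic, diastolic))
    return found


# Made-up sample text; the birth number is synthetic, not a real person's.
sample = "r.č. 850101/0001, TK 130/85"
print(find_birth_numbers(sample))
print(find_blood_pressures(sample))
```

A sketch this simple will still accept some false positives, for example a day/month date written as two numbers separated by a slash, so a real system would likely add range checks or use the surrounding context.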