A keyword-based and semantic-driven data matching approach for assisting the structuring of textual clinical documents

Clinical data stored in a health information system can be divided into two types: structured data and unstructured data. In this paper, a data extraction system is developed to assist data retrieval from unstructured textual clinical documents such as radiology reports and pathology reports. The system provides a keyword-based and semantic-driven data matching methodology that recognizes selected keywords and their related semantics in the documents in order to extract specific information. Through an extraction verification interface, clinicians can extract and verify the matched information semi-automatically. The extracted data can be filled into predefined case-oriented templates, and the resulting structured data can be stored back into the clinical data warehouse for further analysis. Moreover, the case-oriented templates can support the collection of corresponding extracted data for various research studies.
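The workflow described above (match keywords, propose values, let a clinician verify, fill a template) can be illustrated with a minimal Python sketch. It is not the system's actual implementation: the abstract does not say how semantics are modeled, so here they are approximated by lists of keyword variants that map to the same template field. The field names, keyword lists, value patterns, and the `find_candidates` and `fill_template` functions are all hypothetical.

```python
import re

# Hypothetical case-oriented template: each field lists keyword variants that
# are treated as semantically equivalent, plus a pattern for the expected value.
TEMPLATE_FIELDS = {
    "tumor_size": {
        "keywords": ["tumor size", "mass measuring", "lesion measuring"],
        "value": r"(\d+(?:\.\d+)?\s*(?:cm|mm))",
    },
    "histologic_grade": {
        "keywords": ["histologic grade", "nottingham grade"],
        "value": r"([1-3])\b",
    },
}


def find_candidates(report):
    """Return candidate matches for every template field found in the report."""
    candidates = []
    for field, spec in TEMPLATE_FIELDS.items():
        for keyword in spec["keywords"]:
            # Keyword, optional punctuation or linking word, then the value.
            pattern = re.escape(keyword) + r"\W{0,3}(?:is|of|:)?\s*" + spec["value"]
            for match in re.finditer(pattern, report, flags=re.IGNORECASE):
                candidates.append({
                    "field": field,
                    "keyword": keyword,
                    "value": match.group(1),
                    "span": match.span(),
                })
    return candidates


def fill_template(candidates, verify):
    """Fill the template with the first candidate per field that is verified.

    `verify` stands in for the clinician's decision in the verification
    interface: it receives a candidate and returns True to accept it.
    """
    template = {field: None for field in TEMPLATE_FIELDS}
    for candidate in candidates:
        if template[candidate["field"]] is None and verify(candidate):
            template[candidate["field"]] = candidate["value"]
    return template


report = "Tumor size: 2.5 cm. Histologic grade 2."
candidates = find_candidates(report)
accepted_all = fill_template(candidates, lambda c: True)
print(accepted_all)
```

In this sketch the matching step only proposes candidates. Nothing enters the template until the `verify` callback accepts it, which mirrors the semi-automatic verification step described in the abstract.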