An ontology-based approach for text mining of stroke electronic medical records

In this paper, we propose a novel ontology-based approach to text mining and information retrieval for electronic medical records (EMRs). The approach has two advantages, both important for data mining of medical records: it handles the many variations in natural-language text that refer to the same entity, and it infers implicit information from the plain text. We applied the approach to EMR documents of stroke patients in a Chinese medical hospital. A benchmark study on an independent test set shows that the proposed pipeline accurately extracts the vast majority of useful information from the EMR documents, including implicit information recovered through ontology inference. We also carry out a preliminary statistical analysis on a sample EMR set to illustrate how the approach can be used in medical studies.
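The two capabilities named in the abstract, mapping textual variants to a single entity and inferring implicit information, can be illustrated with a minimal Python sketch. This is not the paper's pipeline: the toy ontology, the English terms, and the function names `extract_concepts` and `infer_ancestors` are all hypothetical, and plain substring matching stands in for whatever term recognition the authors use on Chinese text.

```python
# Hypothetical toy ontology: every term, synonym, and relation below is an
# illustrative assumption, not taken from the paper.
SYNONYMS = {
    "cerebral infarction": "ischemic stroke",
    "brain infarct": "ischemic stroke",
    "ischemic stroke": "ischemic stroke",
    "high blood pressure": "hypertension",
    "hypertension": "hypertension",
}

# Child concept -> parent concept (is-a edges), also assumed for illustration.
IS_A = {
    "ischemic stroke": "stroke",
    "stroke": "cerebrovascular disease",
    "hypertension": "cardiovascular risk factor",
}


def extract_concepts(text):
    """Map surface variants found in the text to canonical concepts."""
    lowered = text.lower()
    return {
        canonical
        for variant, canonical in SYNONYMS.items()
        if variant in lowered
    }


def infer_ancestors(concepts):
    """Add every ancestor reachable through is-a edges (implicit information)."""
    inferred = set(concepts)
    for concept in concepts:
        parent = IS_A.get(concept)
        # Walk up the hierarchy until the root or an already-known concept.
        while parent is not None and parent not in inferred:
            inferred.add(parent)
            parent = IS_A.get(parent)
    return inferred


note = "Patient admitted with brain infarct and a history of high blood pressure."
explicit = extract_concepts(note)
implicit = infer_ancestors(explicit) - explicit
```

Here `explicit` holds the canonical concepts stated in the note under variant wording, while `implicit` holds concepts the note never mentions but which follow from the assumed hierarchy, such as "stroke" for a note that only says "brain infarct".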
