Feature Split-based Information Extraction in the Field of Medicine

Symptom information extraction has received growing attention in recent years. Most existing studies work on clinical medical records and focus only on symptom entities, which are not sufficient to convey the full symptom information. This paper presents a feature split-based approach to extracting symptom information from Chinese medicine instruction texts. In this approach, symptom information is split into two parts: the symptom subject entity and the symptom manifestation entity. The main idea is to first recognize the symptom subject and the symptom manifestation automatically, and then add these two recognition results as features to the symptom information extraction task. A series of experiments based on Conditional Random Fields (CRF), a model that many experiments in the medical field have shown to be effective, indicates that the proposed feature split-based approach achieves higher accuracy and recall in symptom information extraction.
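
The two-stage idea described above can be illustrated with a minimal sketch. It assumes that two first-stage recognizers have already labeled each token with a symptom subject tag and a symptom manifestation tag, and it shows only how those tags could be merged into per-token features for the final CRF. The function names, feature keys, tag scheme (`B-SUBJ`, `B-MANIF`, `O`), and the example tokens are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List


def token_features(
    tokens: List[str],
    subject_tags: List[str],
    manifestation_tags: List[str],
    i: int,
) -> Dict[str, str]:
    """Build the feature dict for token i (hypothetical feature set)."""
    feats = {
        "bias": "1",
        "tok": tokens[i],
        # First-stage recognition results, added as features.
        "subj": subject_tags[i],
        "manif": manifestation_tags[i],
    }
    if i > 0:
        # Context from the previous token.
        feats["tok-1"] = tokens[i - 1]
        feats["subj-1"] = subject_tags[i - 1]
        feats["manif-1"] = manifestation_tags[i - 1]
    else:
        feats["BOS"] = "1"  # beginning of sentence
    if i < len(tokens) - 1:
        # Context from the next token.
        feats["tok+1"] = tokens[i + 1]
        feats["subj+1"] = subject_tags[i + 1]
        feats["manif+1"] = manifestation_tags[i + 1]
    else:
        feats["EOS"] = "1"  # end of sentence
    return feats


def sentence_features(
    tokens: List[str],
    subject_tags: List[str],
    manifestation_tags: List[str],
) -> List[Dict[str, str]]:
    """Return one feature dict per token for a whole sentence."""
    if not (len(tokens) == len(subject_tags) == len(manifestation_tags)):
        raise ValueError("tokens and tag sequences must have equal length")
    return [
        token_features(tokens, subject_tags, manifestation_tags, i)
        for i in range(len(tokens))
    ]


# Hypothetical example: a two-character symptom, "head" + "ache".
tokens = ["头", "痛"]
subject_tags = ["B-SUBJ", "O"]          # assumed first-stage output
manifestation_tags = ["O", "B-MANIF"]   # assumed first-stage output
X = sentence_features(tokens, subject_tags, manifestation_tags)
```

Each sentence becomes a list of per-token feature dicts, a layout that common CRF toolkits can consume; the choice of toolkit and the exact feature templates are left open here because the abstract does not specify them.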
