Intelligent Recognition of Named Entity in Electronic Medical Records
Named entity recognition in electronic medical records is essential for building and mining large-scale clinical data in support of clinical decision-making. However, few related studies have been carried out in China. After comparing existing entity recognition methods and models, this paper uses a machine learning method based on the conditional random field (CRF) model to recognize three common types of named entities in Chinese medical records: diseases, clinical symptoms, and operations. After analyzing the data characteristics of electronic medical records, a rich set of features was chosen, including linguistic symbols, part of speech, word formation patterns, word boundaries, and context features. A small-scale corpus was then constructed and annotated from electronic medical records randomly selected from various hospital departments. Finally, three controlled experiments were carried out using CRF++, an implementation of the CRF algorithm. By analyzing how the different features in the feature set affect the ability of the CRF model to recognize entities automatically, we propose some basic rules for CRF feature selection and template design for Chinese medical records. In the controlled experiments, the best F-measures for the three entity types reached 92.67%, 93.76%, and 95.06%.
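The pipeline described above can be illustrated with a minimal sketch. It is not the authors' code: the character-level rows, the BIO label scheme, the B/M/E/S word-boundary encoding, the coarse symbol classes, and the names `char_class`, `to_crfpp_rows`, `bio_spans`, and `f_measure` are all assumptions made for illustration. The sketch turns a segmented, POS-tagged sentence into tab-separated feature columns of the kind CRF++ reads, and computes an entity-level F-measure from gold and predicted spans.

```python
from typing import List, Set, Tuple

def char_class(ch: str) -> str:
    """Coarse symbol class (hypothetical stand-in for a 'linguistic symbol' feature)."""
    if ch.isdigit():
        return "NUM"
    if ch.isascii() and ch.isalpha():
        return "LAT"
    if "\u4e00" <= ch <= "\u9fff":
        return "HAN"
    return "PUN"

def to_crfpp_rows(words: List[Tuple[str, str, str]]) -> List[str]:
    """Expand (word, pos, word-level BIO label) triples into one row per character.

    Columns: character, symbol class, POS, word-boundary tag (B/M/E/S), label.
    """
    rows = []
    for word, pos, label in words:
        n = len(word)
        for i, ch in enumerate(word):
            if n == 1:
                bound = "S"
            elif i == 0:
                bound = "B"
            elif i == n - 1:
                bound = "E"
            else:
                bound = "M"
            # Only the first character of a B- word keeps the B- prefix.
            if label.startswith("B-") and i > 0:
                ch_label = "I-" + label[2:]
            else:
                ch_label = label
            rows.append("\t".join([ch, char_class(ch), pos, bound, ch_label]))
    return rows

def bio_spans(labels: List[str]) -> Set[Tuple[int, int, str]]:
    """Collect (start, end_exclusive, type) spans from a BIO label sequence."""
    spans = set()
    start, etype = None, None
    for i, lab in enumerate(labels + ["O"]):  # sentinel closes a trailing entity
        if start is not None and not (lab.startswith("I-") and lab[2:] == etype):
            spans.add((start, i, etype))
            start, etype = None, None
        if lab.startswith("B-"):
            start, etype = i, lab[2:]
    return spans

def f_measure(gold: Set, pred: Set) -> Tuple[float, float, float]:
    """Entity-level precision, recall, and F1 using exact span matching."""
    tp = len(gold & pred)
    p = tp / len(pred) if pred else 0.0
    r = tp / len(gold) if gold else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f

# Toy sentence (invented example, not from the paper's corpus).
sentence = [("患者", "n", "O"), ("高血压", "n", "B-DIS"), ("3", "m", "O"), ("年", "q", "O")]
rows = to_crfpp_rows(sentence)
gold = bio_spans([r.split("\t")[-1] for r in rows])
```

In a CRF++ feature template, a line such as `U01:%x[0,0]` refers to column 0 of the current row and `U02:%x[-1,2]` to column 2 of the previous row, so context features of the kind mentioned above can be expressed as offsets over these columns. Which columns and offsets work best is exactly what the controlled experiments investigate; the layout here is only one plausible arrangement.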