An Accurate Deep Learning Model for Clinical Entity Recognition From Clinical Notes

The growing use of electronic health records in the medical domain generates a large amount of medical data stored in the form of clinical notes. These notes are rich in clinical entities such as diseases, treatments, tests, drugs, genes, and proteins. Extracting these entities is challenging because clinical notes are written in natural language. Clinical entity extraction has many useful applications, including clinical notes analysis, medical data privacy, decision support systems, and disease analysis. Although various machine learning and deep learning models have been developed to extract clinical entities from clinical notes, developing an accurate model remains challenging. This study presents a novel deep learning-based technique for extracting clinical entities from clinical notes. In contrast to existing models, which use only global context, the proposed model uses both local and global context. The proposed model, a combination of CNN, Bi-LSTM, and CRF with non-complex embedding, outperforms existing models by margins of $4-10\%$ and $5-12\%$ in F1-score on the i2b2-2010 and i2b2-2012 data, respectively. Accurate detection of clinical entities can help preserve the privacy of medical data, which in turn increases the trust of users and medical organizations in sharing medical data.
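The F1-score used for comparison can be illustrated with a minimal sketch. It assumes BIO-style tags and exact-match scoring at the entity level; the abstract does not state the tagging scheme or matching criterion, so both are assumptions. The function names `bio_to_spans` and `entity_f1` and the example labels are hypothetical, chosen only for illustration.

```python
from typing import List, Set, Tuple

# (start index, end index exclusive, entity type)
Span = Tuple[int, int, str]


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from one BIO tag sequence (assumed scheme)."""
    spans: Set[Span] = set()
    start, etype = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a trailing span
        # An I- tag continues the open span only if the type matches.
        inside = tag.startswith("I-") and etype == tag[2:]
        if start is not None and not inside:
            spans.add((start, i, etype))
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
        # An I- tag with no open span of the same type is ignored.
    return spans


def entity_f1(gold: List[List[str]], pred: List[List[str]]) -> float:
    """Exact-match, entity-level F1 over parallel lists of tag sequences."""
    tp = n_gold = n_pred = 0
    for g_tags, p_tags in zip(gold, pred):
        g, p = bio_to_spans(g_tags), bio_to_spans(p_tags)
        tp += len(g & p)  # a span counts only if boundaries and type match
        n_gold += len(g)
        n_pred += len(p)
    if tp == 0:
        return 0.0
    precision, recall = tp / n_pred, tp / n_gold
    return 2 * precision * recall / (precision + recall)


# Illustrative tags only, not taken from the i2b2 data.
gold = [["B-problem", "I-problem", "O", "B-test"]]
pred = [["B-problem", "I-problem", "O", "O"]]
score = entity_f1(gold, pred)
```

In this toy case the prediction recovers one of the two gold entities with no false positives, so precision is 1, recall is 0.5, and the F1-score is 2/3.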