Overview of CCKS 2018 Task 1: Named Entity Recognition in Chinese Electronic Medical Records

CCKS 2018 presented a named entity recognition (NER) task focusing on Chinese electronic medical records (EMRs). The Knowledge Engineering Group of Tsinghua University and Yidu Cloud Beijing Technology Co., Ltd. provided an annotated dataset for the task, which is the only publicly available dataset in the field of Chinese EMRs. Using this dataset, participants developed 69 systems. Across these systems, the traditional CRF and Bi-LSTM models were the most popular choices. The best-performing systems combined a CRF or Bi-LSTM model with complex feature engineering, indicating that feature engineering is still indispensable. The results also showed that rule-based components for identifying clinical named entities could further improve performance on the task.
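To make the idea of feature engineering combined with rules concrete, the following is a minimal sketch of character-level features of the kind a CRF tagger might consume, with one dictionary-based (rule-style) feature. It is an illustration only and not taken from any participating system: the lexicon `SYMPTOM_LEXICON`, the function names, the feature names, and the example sentence are all assumptions made for this sketch.

```python
from typing import Dict, List, Set

# Hypothetical mini-lexicon; a real system would rely on a curated medical dictionary.
SYMPTOM_LEXICON: Set[str] = {"头痛", "咳嗽"}


def lexicon_mask(sentence: str, lexicon: Set[str]) -> List[bool]:
    """Mark every character that falls inside any lexicon term found in the sentence."""
    mask = [False] * len(sentence)
    for term in lexicon:
        start = sentence.find(term)
        while start != -1:
            for k in range(start, start + len(term)):
                mask[k] = True
            start = sentence.find(term, start + 1)
    return mask


def char_features(sentence: str, i: int, mask: List[bool], window: int = 1) -> Dict[str, object]:
    """Build a feature dict for the character at position i."""
    feats: Dict[str, object] = {
        "char": sentence[i],
        "is_digit": sentence[i].isdigit(),
        "in_lexicon": mask[i],  # rule-style dictionary feature
    }
    # Context characters within the window, padded at sentence boundaries.
    for offset in range(-window, window + 1):
        if offset == 0:
            continue
        j = i + offset
        feats[f"char[{offset:+d}]"] = sentence[j] if 0 <= j < len(sentence) else "<PAD>"
    return feats


def sentence_features(sentence: str) -> List[Dict[str, object]]:
    """Feature dicts for every character of the sentence, in order."""
    mask = lexicon_mask(sentence, SYMPTOM_LEXICON)
    return [char_features(sentence, i, mask) for i in range(len(sentence))]


example = sentence_features("患者头痛3天")
```

Each dict in `example` describes one character; a sequence labeler would be trained on such dicts paired with per-character entity tags. The dictionary feature shows one simple way rule-like knowledge can be fed into a statistical model.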
