Paper Information - IITP: A Supervised Approach for Disorder Mention Detection and Disambiguation

IITP: A Supervised Approach for Disorder Mention Detection and Disambiguation

In this paper we briefly describe the supervised machine learning approach to disorder mention detection that we submitted to the SemEval-2014 shared task. The main goal of this task is to build a system that automatically identifies mentions of clinical conditions in clinical texts. The main challenge is that the same concept may be mentioned in many different surface forms. We build the system on two supervised machine learning algorithms, namely Conditional Random Field and Support Vector Machine. One appealing characteristic of our system is that most of the features for learning are extracted automatically from the given training or test datasets, without using deep domain-specific resources or tools. We submitted three runs, and the best-performing system is based on Conditional Random Field. For task A, it achieves precision, recall and F-measure values of 50.00%, 47.90% and 48.90%, respectively, under the strict matching criterion. When the matching criterion is relaxed, it achieves precision, recall and F-measure values of 81.50%, 79.70% and 80.60%, respectively. For task B, we obtain accuracies of 33.30% and 69.60% for the relaxed and strict matches, respectively.
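
The reported task A scores can be checked with a short sketch. It assumes the F-measure is the balanced harmonic mean of precision and recall, that strict matching requires identical mention spans, and that relaxed matching accepts any overlap. The abstract does not spell out these definitions, and the function names and the `(start, end)` span representation are hypothetical choices made for illustration.

```python
def f_measure(precision, recall):
    """Balanced F-measure: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def spans_match(gold, predicted, strict=True):
    """Compare two (start, end) character spans, with end exclusive.

    Assumption: strict means identical boundaries, relaxed means
    the two spans share at least one character.
    """
    if strict:
        return gold == predicted
    return gold[0] < predicted[1] and predicted[0] < gold[1]


# Task A precision and recall (in percent) as reported in the abstract.
strict_f = f_measure(50.00, 47.90)
relaxed_f = f_measure(81.50, 79.70)
print(round(strict_f, 1), round(relaxed_f, 1))
```

Under these assumptions the computed values round to the F-measures given in the abstract, which suggests the reported figures are internally consistent.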

Asif Ekbal | Sriparna Saha | Utpal Kumar Sikdar
