LITL at CLEF eHealth2017: Automatic Classification of Death Reports

This paper describes the participation of a group of students, supervised by two teachers, in Task 1 of the CLEF eHealth 2017 campaign. The task involves the classification of death certificates in French, and more precisely the labelling of each cause of death with the relevant ICD-10 code. The system that performs the automatic coding is based on an information retrieval method using the Solr interface. Two runs were submitted, differing in whether or not the system distinguishes cases of multiple causes. The best performance was obtained by the system that distinguishes multiple causes, with a precision of 0.61 and a recall of 0.55.
