LIMSI @ CLEF eHealth 2015 - Task 1b

This paper presents LIMSI’s participation in the User-Centered Health Information Retrieval task (task 2) at the CLEF eHealth 2015 workshop. In our contribution we explored two strategies for query expansion: one based on entity recognition using MetaMap and the UMLS, and a second based on disease hypothesis generation using self-constructed external resources, such as a corpus of Wikipedia pages describing diseases and conditions and web pages from the Medline Plus health portal. Our best-scoring run was a weighted UMLS-based run that emphasized signs and symptoms recognized in the topic text by MetaMap. This run achieved a P@10 of 0.262 and an nDCG@10 of 0.196.
