From Witch’s Shot to Music Making Bones - Resources for Medical Laymen to Technical Language and Vice Versa

Many people share information on social media or in forums, such as the food they eat, the sports they do, or the events they attend. This also applies to information about a person's health status. What we share online reveals, directly or indirectly, information about our lifestyle and health, and thus provides a valuable data resource. If we can take advantage of that data, applications can be created that enable, e.g., the detection of possible risk factors for diseases or of adverse drug reactions. However, as most people are not medical experts, the language they use tends to be descriptive rather than the precise medical terminology used by medical professionals. To detect and use this relevant information, laymen's language has to be translated and/or linked to the corresponding medical concept. This work presents baseline data sources to address this challenge for German. We introduce a new data set that annotates medical laymen and technical expressions in a patient forum, along with a set of medical synonyms and definitions, and present first baseline results on the data.
