A preliminary analysis of negation in a Spanish clinical records dataset (Análisis preliminar de la negación en un conjunto de informes clínicos en español)

We report an ongoing analysis of negation in a collection of Spanish emergency admission notes. The corpus comprises 354,677 anonymized records (71,297,457 tokens). We explored negation contexts using a set of negation patterns adapted to Spanish. Once these patterns had been extracted from the corpus, we manually inspected occurrences of the most frequent ones. This allowed us to refine the pattern set by adding new lexical and structural variants. The long-term goal of this work is to develop a Negation Processing Module that could be included in a pipeline for the analysis of medical documents.
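As a rough illustration of the pattern-based exploration described above, the following Python sketch counts occurrences of negation triggers in a list of notes and collects the words that follow each trigger for manual inspection. The trigger list, the function names `count_triggers` and `contexts`, and the `window` parameter are assumptions made for this example; they are not the pattern set or tooling used in the study.

```python
import re
from collections import Counter

# Hypothetical, illustrative Spanish negation triggers (an assumption,
# not the patterns used in the study).
TRIGGERS = ["no", "sin", "niega", "no se observa", "sin evidencia de", "ausencia de"]

# Longer triggers come first in the alternation so that
# "no se observa" is preferred over the bare "no" at the same position.
_PATTERN = re.compile(
    r"\b("
    + "|".join(re.escape(t) for t in sorted(TRIGGERS, key=len, reverse=True))
    + r")\b",
    flags=re.IGNORECASE,
)


def count_triggers(notes):
    """Return a Counter mapping each lowercased trigger to its frequency."""
    counts = Counter()
    for note in notes:
        for match in _PATTERN.finditer(note):
            counts[match.group(1).lower()] += 1
    return counts


def contexts(note, window=4):
    """Yield (trigger, following words) pairs to support manual inspection."""
    for match in _PATTERN.finditer(note):
        following = note[match.end():].split()[:window]
        yield match.group(1).lower(), " ".join(following)


if __name__ == "__main__":
    sample = [
        "Paciente niega fiebre. No se observa edema.",
        "Sin evidencia de fractura, no dolor.",
    ]
    for trigger, freq in count_triggers(sample).most_common():
        print(trigger, freq)
```

Sorting the frequency table this way would point to the triggers worth inspecting first, and the `contexts` output could then suggest lexical or structural variants missing from the list.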
