Unsupervised Resolution of Acronyms and Abbreviations in Nursing Notes Using Document-Level Context Models

Automatic simplification of clinical notes remains an important challenge for NLP systems. A frequent obstacle to building more robust NLP systems for the clinical domain is the lack of annotated training data. This study investigates unsupervised techniques for one key aspect of medical text simplification: the expansion and disambiguation of acronyms and abbreviations. Our approach combines statistical machine translation with document-context neural language models to disambiguate multi-sense terms. In addition, we investigate the use of mismatched training data and self-training. Evaluated on nursing progress notes, these techniques achieve a disambiguation accuracy of 71.6% without any manual annotation effort.
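To illustrate the general idea of context-based disambiguation, the following minimal sketch picks the expansion whose language model best explains the rest of the document. It is not the system described above: a smoothed unigram model stands in for the document-context neural language model, the statistical machine translation component is omitted, and the sense inventory, the training snippets, and the function names (`train_sense_models`, `context_log_prob`, `disambiguate`) are all hypothetical. The sketch assumes that text in which the long form is already spelled out can serve as training data, so that no manual sense annotation is needed.

```python
import math
from collections import Counter


def train_sense_models(expanded_docs):
    """Build one unigram count table per expansion.

    expanded_docs: iterable of (expansion, text) pairs, where the text is
    assumed to contain the spelled-out long form (hypothetical toy data).
    """
    models = {}
    for expansion, text in expanded_docs:
        models.setdefault(expansion, Counter()).update(text.lower().split())
    return models


def context_log_prob(tokens, counts, vocab_size):
    """Add-one-smoothed unigram log-probability of the context tokens."""
    total = sum(counts.values())
    return sum(math.log((counts[t] + 1) / (total + vocab_size)) for t in tokens)


def disambiguate(abbrev, document, inventory, models):
    """Return the candidate expansion that best explains the document context."""
    # Use every token in the document except the abbreviation itself.
    tokens = [t for t in document.lower().split() if t != abbrev.lower()]
    vocab = set(tokens)
    for counts in models.values():
        vocab.update(counts)
    return max(
        inventory[abbrev],
        key=lambda exp: context_log_prob(tokens, models[exp], len(vocab)),
    )


# Hypothetical sense inventory and training snippets, for illustration only.
inventory = {"pt": ["patient", "physical therapy"]}
models = train_sense_models([
    ("patient", "patient resting comfortably in bed vitals stable"),
    ("physical therapy", "physical therapy consulted for gait training and exercises"),
])

print(disambiguate("pt", "pt consulted for gait training", inventory, models))
print(disambiguate("pt", "pt resting in bed", inventory, models))
```

Scoring the whole document rather than a fixed window around the abbreviation mirrors, in a very reduced form, the motivation for using document-level context when resolving multi-sense terms.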
