Clinical Text Prediction with Numerically Grounded Conditional Language Models

Assisted text input techniques can save time and effort and improve text quality. In this paper, we investigate how grounded and conditional extensions to standard neural language models can improve performance on the tasks of word prediction and word completion. These extensions incorporate a structured knowledge base and numerical values from the text into the context used to predict the next word. Our automated evaluation on a clinical dataset shows that the extended models significantly outperform standard models. Our best system uses both conditioning and grounding, because their benefits are orthogonal. For word prediction with a list of 5 suggestions, it improves recall from 25.03% to 71.28%, and for word completion it improves keystroke savings from 34.35% to 44.81%, where the theoretical bound for this dataset is 58.78%. We also perform a qualitative investigation of how models with lower perplexity occasionally fare worse at the tasks. We found that at test time numbers have more influence on the document level than on individual word probabilities.
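To make the two evaluation measures concrete, the following is a minimal sketch of how recall over a suggestion list and keystroke savings could be computed. It is not the evaluation code behind the reported numbers. A toy unigram frequency table stands in for the language model, and the typing protocol is an assumption: suggestions for completion appear only after at least one typed character, and accepting a suggestion costs one keystroke. The names `suggest`, `recall_at_k`, `keys_for_word` and `keystroke_savings` are hypothetical.

```python
from collections import Counter


def suggest(counts, prefix, k):
    """Return up to k most frequent vocabulary words that start with prefix."""
    matches = [w for w in counts if w.startswith(prefix)]
    # Sort by descending count, then alphabetically so ties are deterministic.
    matches.sort(key=lambda w: (-counts[w], w))
    return matches[:k]


def recall_at_k(counts, targets, k=5):
    """Word prediction: fraction of targets found in the k suggestions
    offered before any character is typed."""
    top = suggest(counts, "", k)
    hits = sum(1 for w in targets if w in top)
    return hits / len(targets)


def keys_for_word(counts, word, k):
    """Keystrokes needed for one word under the assumed protocol:
    type characters until the word shows up in the suggestions,
    then spend one keystroke to accept it."""
    for typed in range(1, len(word)):
        if word in suggest(counts, word[:typed], k):
            return typed + 1  # typed characters plus one accept key
    return len(word)  # never suggested in time: type the whole word


def keystroke_savings(counts, targets, k=1):
    """Word completion: (characters - keystrokes) / characters."""
    total_chars = sum(len(w) for w in targets)
    keys = sum(keys_for_word(counts, w, k) for w in targets)
    return (total_chars - keys) / total_chars


# Toy "training" text and test words (illustrative only, not clinical data).
counts = Counter("the patient has the heart the patient".split())
targets = ["the", "heart", "patient"]

recall = recall_at_k(counts, targets, k=2)
savings = keystroke_savings(counts, targets, k=1)
print(f"recall@2 = {recall:.4f}, keystroke savings = {savings:.4f}")
```

On this toy data the three target words contain 15 characters and need 7 keystrokes, so the keystroke savings are 8/15. Because every word costs at least one typed character plus one accept key under this protocol, savings have an upper limit below 100% that depends on the word lengths in the data.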
