Automated methods for the summarization of electronic health records

Objectives: This review examines work on automated summarization of electronic health record (EHR) data and, in particular, summarization of individual patient records. We organize the published research and highlight methodological challenges in implementing EHR summarization.

Target audience: This review is intended for researchers, designers, and informaticians concerned with information overload in the clinical setting, as well as users and developers of clinical summarization systems.

Scope: Automated summarization has long been studied in natural language processing and human–computer interaction, but the translation of summarization and visualization methods to the complexity of the clinical workflow has been slow. We assess work in aggregating and visualizing patient information, with a particular focus on methods for detecting and removing redundancy, describing temporality, determining salience, accounting for missing data, and taking advantage of encoded clinical knowledge. We identify and discuss open challenges critical to the implementation and use of robust EHR summarization systems.
