Towards Clinical Encounter Summarization: Learning to Compose Discharge Summaries from Prior Notes

The records of a clinical encounter can be extensive and complex, placing a premium on tools that extract and summarize relevant information. This paper introduces the task of generating discharge summaries for a clinical encounter. Summaries in this setting must be faithful, traceable, and scalable to multiple long documents, motivating the use of extract-then-abstract summarization cascades. For evaluation, we introduce two new measures, faithfulness and hallucination rate, which complement existing measures of fluency and informativeness. Experiments across seven medical sections and five models show that a summarization architecture supporting traceability yields promising results, and that a sentence-rewriting approach performs consistently on our faithfulness measure (faithfulness-adjusted F3) across a diverse range of generated sections.
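As a rough illustration of the kind of measure involved (not the paper's exact definition), a hallucination rate can be sketched as the fraction of summary sentences with no supporting sentence in the source notes, using word overlap as a crude support test:

```python
# Hypothetical sketch of a hallucination-rate measure. The overlap
# threshold and the word-overlap support test are assumptions for
# illustration; the paper's actual definitions may differ.

def word_overlap(a: str, b: str) -> float:
    """Fraction of words in sentence `a` that also appear in sentence `b`."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa) if wa else 0.0

def hallucination_rate(summary_sents, source_sents, threshold=0.5):
    """Share of summary sentences unsupported by any source sentence."""
    unsupported = sum(
        1 for s in summary_sents
        if all(word_overlap(s, src) < threshold for src in source_sents)
    )
    return unsupported / len(summary_sents) if summary_sents else 0.0

source = ["patient admitted with chest pain", "treated with aspirin"]
summary = ["patient admitted with chest pain", "discharged on warfarin"]
print(hallucination_rate(summary, source))  # 0.5: second sentence unsupported
```

A faithfulness measure can be seen as the complementary view, scoring how much of the generated content is supported by the source; here the second summary sentence introduces a medication never mentioned in the notes, so it counts as hallucinated.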
