Overview of the MEDIQA-Chat 2023 Shared Tasks on the Summarization & Generation of Doctor-Patient Conversations

Automatic generation of clinical notes from doctor-patient conversations can play a key role in reducing doctors’ daily workload and improving their interactions with patients. MEDIQA-Chat 2023 aims to advance and promote research on effective solutions through shared tasks on the automatic summarization of doctor-patient conversations and on the generation of synthetic dialogues from clinical notes for data augmentation. Seventeen teams participated in the challenge and experimented with a broad range of approaches and models. In this paper, we describe the three MEDIQA-Chat 2023 tasks, the datasets, and the participants’ results and methods. We hope that these shared tasks will lead to additional research efforts and insights on the automatic generation and evaluation of clinical notes.
