Summarizing Medical Conversations via Identifying Important Utterances

Summarization is an important natural language processing (NLP) task that identifies key information in text. For conversations, summarization systems need to extract salient content from spontaneous utterances by multiple speakers. In one task-oriented scenario, medical conversations between patients and doctors, the symptoms, diagnoses, and treatments are especially important because the purpose of such a conversation is to find a medical solution to the problem raised by the patient. Current online medical platforms provide millions of publicly available conversations between real patients and doctors, in which patients describe their medical problems and registered doctors offer diagnoses and treatments; in most cases a conversation is long and the key information is hard to locate. Summaries of the patients’ problems and the doctors’ treatments can therefore be highly useful, giving other patients with similar problems a precise reference for potential medical solutions. In this paper, we focus on medical conversation summarization, using a dataset of medical conversations and corresponding summaries crawled from a well-known online healthcare service provider in China. We propose a hierarchical encoder-tagger model (HET) that generates summaries by identifying important utterances (with respect to problem proposing and solving) in the conversations. For the particular dataset used in this study, we show that high-quality summaries can be generated by extracting two types of utterances, namely problem statements and treatment recommendations. Experimental results demonstrate that HET outperforms strong baselines and models from previous studies, and that adding conversation-related features further improves system performance.
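
The extraction step described above can be illustrated with a minimal sketch. It assumes each utterance has already been assigned a tag by some tagger (for example, a model such as HET); the tagger itself is not implemented here. The tag names `PROBLEM`, `TREATMENT`, and `OTHER`, the `Utterance` class, the `build_summaries` function, and the sample conversation are all hypothetical and are not taken from the paper or its dataset.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical tag names for the two utterance types mentioned in the abstract.
PROBLEM = "PROBLEM"      # a problem statement
TREATMENT = "TREATMENT"  # a treatment recommendation
OTHER = "OTHER"          # anything else (greetings, follow-up questions, ...)


@dataclass
class Utterance:
    speaker: str  # e.g. "patient" or "doctor"
    text: str
    tag: str      # assumed to come from an utterance tagger


def build_summaries(conversation: List[Utterance]) -> Dict[str, str]:
    """Join the tagged utterances, in conversation order, into two summaries."""
    problem = [u.text for u in conversation if u.tag == PROBLEM]
    treatment = [u.text for u in conversation if u.tag == TREATMENT]
    return {"problem": " ".join(problem), "treatment": " ".join(treatment)}


# Invented example conversation, for illustration only.
conversation = [
    Utterance("patient", "Hello doctor.", OTHER),
    Utterance("patient", "I have had a dry cough for two weeks.", PROBLEM),
    Utterance("doctor", "Do you have a fever?", OTHER),
    Utterance("patient", "No fever.", OTHER),
    Utterance("doctor", "Drink more water and see a clinic if it persists.", TREATMENT),
]

summaries = build_summaries(conversation)
print(summaries["problem"])
print(summaries["treatment"])
```

In this sketch, summary quality depends entirely on the tags: the function only selects and concatenates utterances, which corresponds to the extractive setting the abstract describes.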
