Knowledge Abstraction Matching for Medical Question Answering

Medical Question Answering (medical QA), which studies the problem of automatically answering patients' medical questions online, is one of the major applications of bioinformatics. Despite considerable prior work, medical QA systems still require careful algorithmic optimization, because the application is high-stakes and the requirements on answer quality are strict. In this paper, we introduce a novel Knowledge Abstraction Matching (KAM) method for the medical QA problem. The intuition behind KAM is that many text segments recur frequently in the answers to different questions. Based on this observation, we propose a method that consists of four steps: frequent segment $N$-gram mining, medical knowledge abstraction, medical segment matching, and answer re-retrieval. KAM has been incorporated into Baidu's enterprise medical QA system MelodyQA, which is deployed on the backend of Muzhi Doctor. The evaluation shows that the proposed method produces more high-quality answers for MelodyQA, significantly improving question coverage while keeping accuracy at an acceptable level.
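
To make the first step concrete, the following is a minimal sketch of frequent segment $N$-gram mining over a set of answers. It is not the KAM implementation: the function name `mine_frequent_ngrams`, the whitespace tokenization, the `min_support` threshold, and the toy answers are all assumptions made for illustration. The sketch counts each $N$-gram once per answer and keeps those that occur in at least `min_support` answers.

```python
from __future__ import annotations

from collections import Counter
from typing import Iterable


def mine_frequent_ngrams(
    answers: Iterable[list[str]], n: int, min_support: int
) -> dict[tuple[str, ...], int]:
    """Return n-grams that occur in at least `min_support` answers.

    Hypothetical helper for illustration; support is counted per answer,
    so repeats inside a single answer do not inflate the count.
    """
    support: Counter = Counter()
    for tokens in answers:
        # Collect the distinct n-grams of this answer (empty if too short).
        seen = {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}
        support.update(seen)
    return {gram: count for gram, count in support.items() if count >= min_support}


# Toy answers (assumed data), tokenized on whitespace.
answers = [
    "drink more water and rest well".split(),
    "take the medicine and drink more water".split(),
    "drink more water every day".split(),
]

frequent = mine_frequent_ngrams(answers, n=3, min_support=2)
print(frequent)
```

On this toy input, only the segment "drink more water" is shared by at least two answers. A production system would presumably need language-appropriate word segmentation and a range of values for $N$; those choices are outside what the abstract specifies.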
