Recognizing Question Entailment for Medical Question Answering

With the increasing heterogeneity and specialization of medical texts, automated question answering is becoming increasingly challenging. In this context, answering a given medical question by retrieving similar questions that have already been answered by human experts is a promising solution. In this paper, we propose a new approach for detecting similar questions based on Recognizing Question Entailment (RQE). In particular, we consider Frequently Asked Questions (FAQs) as a valuable and widespread source of information. Our final goal is to automatically provide an existing answer whenever an FAQ similar to a consumer health question exists. We evaluate our approach using consumer health questions received by the National Library of Medicine and FAQs collected from NIH websites. Our first results are promising and suggest that the approach is feasible as a complement to classic question answering approaches.
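The retrieval idea described above can be illustrated with a minimal sketch. It is not the RQE method of the paper: a symmetric word-overlap (Jaccard) score stands in for an entailment classifier, which would be directional. The function names, the 0.5 threshold, and the FAQ entries with placeholder answers are all assumptions made for illustration.

```python
import re

def tokens(text):
    """Return the set of lowercase word tokens (no stemming, no stop-word removal)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def overlap_score(chq, faq_question):
    """Jaccard word overlap; an assumed stand-in for an RQE classifier score."""
    a, b = tokens(chq), tokens(faq_question)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)

def answer_from_faqs(chq, faqs, threshold=0.5):
    """Return the answer of the best-matching FAQ, or None if no FAQ is similar enough.

    `faqs` is a list of (question, answer) pairs; `threshold` is a hypothetical cutoff.
    """
    best = max(faqs, key=lambda qa: overlap_score(chq, qa[0]), default=None)
    if best is None or overlap_score(chq, best[0]) < threshold:
        return None
    return best[1]

# Hypothetical FAQ entries with placeholder answers.
FAQS = [
    ("What are the symptoms of diabetes?", "Placeholder answer A."),
    ("How is asthma treated?", "Placeholder answer B."),
]

print(answer_from_faqs("What are the symptoms of diabetes", FAQS))
print(answer_from_faqs("Is ibuprofen safe during pregnancy?", FAQS))
```

Returning None when no FAQ passes the threshold reflects the stated goal of providing an existing answer only if a similar FAQ exists; other questions would fall back to a classic question answering approach.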
