Answer Extraction, Semantic Clustering, and Extractive Summarization for Clinical Question Answering

This paper presents a hybrid approach to question answering in the clinical domain that combines techniques from summarization and information retrieval. We tackle a frequently occurring class of questions that takes the form "What is the best drug treatment for X?" Starting from an initial set of MEDLINE citations, our system first identifies the drugs under study. Abstracts are then clustered using semantic classes from the UMLS ontology. Finally, a short extractive summary is generated for each abstract to populate the clusters. Two evaluations (a manual one focused on short answers and an automatic one focused on the supporting abstract) demonstrate that our system compares favorably to PubMed, the search system most widely used by physicians today.
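The three stages described above (drug identification, semantic clustering, and per-abstract extractive summarization) can be illustrated with a minimal sketch. It is not the system's implementation: the `DRUG_CLASS` lexicon, the frequency-based `main_drug` heuristic, and the last-sentence `short_summary` heuristic are all hypothetical stand-ins chosen for illustration, and nothing here accesses UMLS or MEDLINE.

```python
import re
from collections import defaultdict

# Hypothetical toy lexicon: drug name -> broader class label.
# This only stands in for a semantic-class lookup; it is not UMLS data.
DRUG_CLASS = {
    "amoxicillin": "antibiotic",
    "azithromycin": "antibiotic",
    "ibuprofen": "analgesic",
}


def main_drug(abstract):
    """Return the lexicon drug mentioned most often, or None (toy heuristic)."""
    words = re.findall(r"[a-z]+", abstract.lower())
    counts = {drug: words.count(drug) for drug in DRUG_CLASS}
    best = max(counts, key=counts.get)
    return best if counts[best] > 0 else None


def short_summary(abstract):
    """Toy extractive summary: the final sentence of the abstract (assumption)."""
    parts = re.split(r"(?<=[.!?])\s+", abstract.strip())
    sentences = [s.strip() for s in parts if s.strip()]
    return sentences[-1] if sentences else ""


def organize(abstracts):
    """Group abstracts by the class of their main drug, keeping one short summary each."""
    clusters = defaultdict(list)
    for text in abstracts:
        drug = main_drug(text)
        if drug is None:
            continue  # no known drug found; skip this abstract
        clusters[DRUG_CLASS[drug]].append((drug, short_summary(text)))
    return dict(clusters)


if __name__ == "__main__":
    sample = [
        "Amoxicillin was compared with placebo. Amoxicillin reduced symptoms.",
        "Ibuprofen was given for pain. Ibuprofen relieved pain faster.",
    ]
    for label, members in organize(sample).items():
        print(label, members)
```

In this sketch each cluster label plays the role of a semantic class, and each cluster is populated with a (drug, summary sentence) pair per abstract, mirroring the organization the abstract describes.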
