Development, implementation, and a cognitive evaluation of a definitional question answering system for physicians

The published medical literature and online medical resources are important sources of information that help physicians make patient treatment decisions. Traditional information retrieval systems (e.g., PubMed) typically return a list of documents in response to a user's query. The number of documents returned from large knowledge repositories is often so large that information seeking is practical only "after hours" and not in the clinical setting. In this study we developed novel algorithms and designed, implemented, and evaluated a medical definitional question answering system (MedQA). MedQA automatically analyzes a large number of electronic documents to generate short, coherent answers to definitional questions (i.e., questions of the form "What is X?"). Our preliminary cognitive evaluation shows that MedQA outperformed three other online information systems (Google, OneLook, and PubMed) on two important efficiency criteria: the time spent and the number of actions taken by a physician to identify a definition. We contend that question answering systems that aggregate pertinent information scattered across different documents have the potential to address clinical information needs within the timeframe that clinicians require.
