Customization in a unified framework for summarizing medical literature

OBJECTIVE: We present the summarization system in the PErsonalized Retrieval and Summarization of Images, Video and Language (PERSIVAL) medical digital library. Although we discuss the context of our summarization research within the PERSIVAL platform, the primary focus of this article is on strategies to define and generate customized summaries.

METHODS AND MATERIAL: Our summarizer employs a unified user model to create a tailored summary of relevant documents for either a physician or a lay person. The approach takes advantage of regularities in the text structure and content of medical literature to fulfill identified user needs.

RESULTS: The resulting summaries combine machine-generated text with text extracted from multiple input documents. Customization includes both group-based modeling for two classes of users, physician and lay person, and individually driven models based on a patient record.

CONCLUSIONS: Our research shows that customization is feasible in a medical digital library.
