A Comparison of Different Retrieval Strategies Working on Medical Free Texts

Patient information in health care systems consists mostly of textual data, and free text in particular makes up a significant share of it. Information retrieval systems that target these texts must cope with the challenges that medical free text poses in order to achieve acceptable performance. This paper describes the evaluation of four different types of information retrieval strategies: keyword search, search performed by a medical domain expert, a semantics-based information retrieval tool, and a purely statistical information retrieval method. The methods are evaluated and compared with respect to their application in health care systems.
