1. Application of text mining to biomedical knowledge extraction: analyzing clinical narratives and medical literature

Text mining is one of the tools that can help researchers and clinicians cope with the surfeit of biomedical information. In this chapter, we explore how text mining is used for biomedical knowledge extraction and show how it can retrieve relevant information from vast online databases of health science literature and from patients’ electronic health records. We describe the four phases of biomedical knowledge extraction using text mining (text gathering, text preprocessing, text analysis, and presentation) that are involved in retrieving the sought information with a high accuracy rate. The chapter also includes an in-depth analysis of the differences between clinical text found in electronic health records and biomedical text found in online journals, books, and conference papers, as well as a presentation of various text mining tools developed in both university and commercial settings.
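As a rough illustration of how the four phases fit together, the following minimal Python sketch chains text gathering, preprocessing, analysis, and presentation. It is not the chapter's implementation: the function names, the two sample sentences, the tiny stop-word list, and the use of plain term counting as the "analysis" step are all assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch).
STOP_WORDS = {"the", "a", "an", "of", "and", "with", "was", "for", "in", "to", "is"}


def gather_texts():
    """Phase 1, text gathering: a stand-in for querying a literature
    database or an electronic health record store. The sentences are made up."""
    return [
        "The patient was treated with metformin for type 2 diabetes.",
        "Metformin is a first-line therapy in type 2 diabetes.",
    ]


def preprocess(text):
    """Phase 2, text preprocessing: lowercase, tokenize, drop stop words."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def analyze(token_lists):
    """Phase 3, text analysis: here simply term counts over all documents.
    A real system would use classification, entity recognition, and so on."""
    counts = Counter()
    for tokens in token_lists:
        counts.update(tokens)
    return counts


def present(counts, top_n=3):
    """Phase 4, presentation: format the most frequent terms for display."""
    return [f"{term}: {n}" for term, n in counts.most_common(top_n)]


def run_pipeline():
    """Run the four phases in order and return the formatted result."""
    docs = gather_texts()
    token_lists = [preprocess(d) for d in docs]
    return present(analyze(token_lists))


if __name__ == "__main__":
    for line in run_pipeline():
        print(line)
```

Keeping each phase as a separate function mirrors the phase structure described above and lets any one step (for example, the analysis) be swapped for a more capable method without touching the others.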
