Online tools to support literature-based discovery in the life sciences

In biomedical research, the volume of experimental data and published scientific information is overwhelming and ever-increasing, which may inhibit rather than stimulate scientific progress. Text-mining and information extraction tools are needed to make the biomedical literature accessible, and their results can also assist researchers in formulating and evaluating novel hypotheses. This requires an additional set of technological approaches, defined here as literature-based discovery (LBD) tools. Several LBD tools have recently been developed for this purpose, and a few well-motivated, specific and directly testable hypotheses have been published, some of which have been validated experimentally. This paper presents an overview of recent LBD research and discusses the methodology, results and online tools available to the scientific community.
