Mining literature for systems biology

Currently, literature is integrated in systems biology studies in three ways. First, hand-curated pathways have been sufficient for assembling models in numerous studies. Second, literature is frequently accessed in a derived form, such as the concepts represented by the Medical Subject Headings (MeSH) and the Gene Ontology (GO), or the functional relationships captured in protein-protein interaction (PPI) databases; both are convenient, consistent reductions of more complex concepts expressed as free text in the literature. Moreover, their contents are easily integrated into the computational processes required for dealing with large data sets. Third, mining text directly for specific types of information is on the rise as text analytics methods become more accurate and accessible. These three uses of literature (manual curation, derived concepts captured in ontologies and databases, and the indirect and direct application of text mining) will be discussed as they pertain to systems biology.
