Constructing biological networks through combined literature mining and microarray analysis: a LMMA approach

MOTIVATION: Reconstructing networks of biological entities is essential for understanding biological processes and the organizational principles of biological systems. This work integrates the literature with microarray gene-expression data: a combined literature mining and microarray analysis (LMMA) approach is developed to construct gene networks for a specific biological system.

RESULTS: In the LMMA approach, a global network is first constructed using a literature-based co-occurrence method. It is then refined using microarray data through a multivariate selection procedure. An application of LMMA to angiogenesis is presented. Our results show that the LMMA-based network is more reliable than the co-occurrence-based network across the KEGG gene, KEGG Orthology and pathway levels.

AVAILABILITY: The LMMA program is available upon request.
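
The two-stage idea described above (a literature co-occurrence network, then refinement with expression data) can be illustrated with a minimal Python sketch. This is not the LMMA program. The abstract does not specify the co-occurrence scoring or the multivariate selection procedure, so the sketch assumes a simple stand-in: each gene's standardized expression is regressed on its literature neighbours by least squares, and an edge is kept when its coefficient passes a cutoff. The function names, the `min_count` and `min_abs_coef` parameters, and the toy data are all hypothetical.

```python
import re
from itertools import combinations

import numpy as np


def cooccurrence_network(abstracts, genes, min_count=1):
    """Link two genes named together in at least min_count abstracts.

    Naive assumption: gene symbols are matched as whole alphanumeric tokens.
    """
    counts = {}
    for text in abstracts:
        tokens = set(re.findall(r"[A-Za-z0-9]+", text.upper()))
        present = sorted(g for g in genes if g.upper() in tokens)
        for a, b in combinations(present, 2):
            counts[(a, b)] = counts.get((a, b), 0) + 1
    return {pair for pair, n in counts.items() if n >= min_count}


def _standardize(values):
    """Centre to zero mean and scale to unit variance (profile must not be constant)."""
    v = np.asarray(values, dtype=float)
    return (v - v.mean()) / v.std()


def refine_with_expression(edges, expr, min_abs_coef=0.3):
    """Keep only literature edges that the expression data support.

    Assumed stand-in for the multivariate selection step: regress each gene
    on all of its literature neighbours and keep an edge if the neighbour's
    standardized coefficient reaches min_abs_coef in either direction.
    expr maps gene -> expression values measured over the same samples.
    """
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    kept = set()
    for target, nbrs in neighbours.items():
        nbrs = sorted(nbrs)
        X = np.column_stack([_standardize(expr[g]) for g in nbrs])
        y = _standardize(expr[target])
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        for g, c in zip(nbrs, coef):
            if abs(c) >= min_abs_coef:
                kept.add(tuple(sorted((target, g))))
    return kept


if __name__ == "__main__":
    # Toy inputs, invented for illustration only.
    abstracts = [
        "VEGF and KDR were co-expressed in endothelial cells.",
        "VEGF binds KDR.",
        "VEGF and TP53 were both measured.",
    ]
    expr = {
        "VEGF": [1, 2, 3, 4, 5, 6],
        "KDR": [1.1, 2.0, 2.9, 4.2, 5.1, 5.9],
        "TP53": [1, -1, 0, 0, -1, 1],
    }
    literature = cooccurrence_network(abstracts, ["VEGF", "KDR", "TP53"])
    print(sorted(literature))
    print(sorted(refine_with_expression(literature, expr)))
```

In this toy case the literature stage links VEGF to both KDR and TP53, while the refinement stage drops the VEGF and TP53 edge because the invented TP53 profile is uncorrelated with VEGF. A real application would need proper gene-name recognition, a statistically justified selection criterion instead of a fixed cutoff, and far more samples than neighbours per gene.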
