Use of Data-Biased Random Walks on Graphs for the Retrieval of Context-Specific Networks from Genomic Data

Extracting network-based functional relationships from genomic datasets is an important challenge in the computational analysis of large-scale data. Although many methods, both public and commercial, have been developed, identifying the networks of interactions that are most relevant to a given input dataset remains an open problem. Here, we use random walks on graphs as a platform for scoring network components based on simultaneous assessment of the experimental data and local network connectivity. With this method, NetWalk, we calculate the distribution of Edge Flux values over the interactions in the network; the Edge Flux of an interaction reflects its relevance given the experimental data. We show that network-based analyses of genomic data are simpler and more accurate with NetWalk than with some of the currently employed methods. We also present a NetWalk analysis of microarray gene expression data from MCF7 cells exposed to different doses of doxorubicin, which reveals a switch-like pattern in the p53-regulated network in cell cycle arrest and apoptosis. Our analyses demonstrate the use of NetWalk as a valuable tool for generating high-confidence hypotheses from high-content genomic data.
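As a rough illustration of the idea of a data-biased random walk, the sketch below scores edges on a toy directed graph. It is not the published NetWalk implementation: the function name `edge_flux`, the restart parameter, the rule that a step from a node chooses each neighbor with probability proportional to that neighbor's data-derived weight, and the definition of flux as the stationary visit probability of the source node times the transition probability are all assumptions made for this example.

```python
def edge_flux(edges, weights, restart=0.05, iters=500):
    """Toy data-biased random walk (illustrative assumption, not NetWalk itself).

    edges   : iterable of (source, target) pairs in a directed graph
    weights : dict mapping every node to a positive data-derived weight
              (hypothetical, e.g. computed from expression values)
    Returns (pi, flux): approximate stationary node probabilities and
    a per-edge flux score.
    """
    nodes = sorted(weights)
    out = {n: [] for n in nodes}
    for s, t in dict.fromkeys(edges):  # drop duplicate edges, keep order
        out[s].append(t)

    # Transition probabilities: a step from s favors high-weight targets.
    P = {}
    for s in nodes:
        total = sum(weights[t] for t in out[s])
        for t in out[s]:
            P[(s, t)] = weights[t] / total

    # Power iteration with a uniform restart; nodes without outgoing
    # edges send all of their probability mass to the restart.
    pi = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iters):
        nxt = {n: 0.0 for n in nodes}
        restart_mass = 0.0
        for s in nodes:
            if out[s]:
                for t in out[s]:
                    nxt[t] += (1.0 - restart) * pi[s] * P[(s, t)]
                restart_mass += restart * pi[s]
            else:
                restart_mass += pi[s]
        for n in nodes:
            nxt[n] += restart_mass / len(nodes)
        pi = nxt

    # Flux of an edge: probability of being at the source, then taking it.
    flux = {edge: pi[edge[0]] * p for edge, p in P.items()}
    return pi, flux


# Toy example: B carries a stronger data signal than C.
toy_edges = [("A", "B"), ("A", "C"), ("B", "A"), ("C", "A")]
toy_weights = {"A": 1.0, "B": 4.0, "C": 1.0}
pi, flux = edge_flux(toy_edges, toy_weights)
for edge, value in sorted(flux.items(), key=lambda kv: -kv[1]):
    print(edge, round(value, 4))
```

In this toy graph the edge from A to B receives four times the flux of the edge from A to C, because both leave the same node and B's weight is four times C's. The two edges are topologically equivalent, so the difference comes entirely from the data weights; in larger graphs the stationary probabilities also carry the effect of connectivity.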
