Network neighborhood analysis with the multi-node topological overlap measure

MOTIVATION The goal of neighborhood analysis is to find a set of genes (the neighborhood) that is similar to an initial 'seed' set of genes. Neighborhood analysis methods for network data are important in systems biology. If individual network connections are susceptible to noise, it can be advantageous to define neighborhoods on the basis of a robust interconnectedness measure, e.g. the topological overlap measure. Since the use of multiple nodes in the seed set may lead to more informative neighborhoods, it can be advantageous to define multi-node similarity measures. RESULTS The pairwise topological overlap measure is generalized to multiple network nodes and subsequently used in a recursive neighborhood construction method. A local permutation scheme is used to determine the neighborhood size. Using four network applications and a simulated example, we provide empirical evidence that the resulting neighborhoods are biologically meaningful, e.g. we use neighborhood analysis to identify brain cancer related genes. AVAILABILITY An executable Windows program and tutorial for multi-node topological overlap measure (MTOM) based analysis can be downloaded from the webpage (http://www.genetics.ucla.edu/labs/horvath/MTOM/).

[1]  David Martin,et al.  Functional classification of proteins for the prediction of cellular function from a protein-protein interaction network , 2003, Genome Biology.

[2]  J. Mesirov,et al.  Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. , 1999, Science.

[3]  Mike Tyers,et al.  The GRID: The General Repository for Interaction Datasets , 2003, Genome Biology.

[4]  Adam Godzik,et al.  Comparative analysis of protein domain organization. , 2004, Genome research.

[5]  Y. Leea,et al.  Analysis of oncogenic signaling networks in glioblastoma identifies ASPM as a molecular target , 2006 .

[6]  A. Barabasi,et al.  Hierarchical Organization of Modularity in Metabolic Networks , 2002, Science.

[7]  Edward R. Dougherty,et al.  Inferring gene regulatory networks from time series data using the minimum description length principle , 2006, Bioinform..

[8]  D. Botstein,et al.  Cluster analysis and display of genome-wide expression patterns. , 1998, Proceedings of the National Academy of Sciences of the United States of America.

[9]  Vladimir Pavlovic,et al.  Protein classification using probabilistic chain graphs and the Gene Ontology structure , 2006, Bioinform..

[10]  Mark Gerstein,et al.  Information assessment on predicting protein-protein interactions , 2004, BMC Bioinformatics.

[11]  John M. Walker,et al.  Comparative Genomics , 2007, Methods In Molecular Biology™.

[12]  A. Barabasi,et al.  Functional and topological characterization of protein interaction networks , 2004, Proteomics.

[13]  Hans-Werner Mewes,et al.  MPact: the MIPS protein interaction resource on yeast , 2005, Nucleic Acids Res..

[14]  Limsoon Wong,et al.  Exploiting Indirect Neighbours and Topological Weight to Predict Protein Function from Protein-Protein Interactions , 2006, BioDM.

[15]  S. Horvath,et al.  Statistical Applications in Genetics and Molecular Biology , 2011 .

[16]  Mong-Li Lee,et al.  Increasing confidence of protein interactomes using network topological metrics , 2006, Bioinform..

[17]  Ting Chen,et al.  Mapping gene ontology to proteins based on protein-protein interaction data , 2004, Bioinform..

[18]  Aiqing He,et al.  Identification of inflammatory gene modules based on variations of human endothelial cell responses to oxidized lipids , 2006, Proceedings of the National Academy of Sciences.

[19]  S. Horvath,et al.  Evidence for anti-Burkitt tumour globulins in Burkitt tumour patients and healthy individuals. , 1967, British Journal of Cancer.

[20]  S. Lerbs-Mache,et al.  Sub-plastidial localization of two different phage-type RNA polymerases in spinach chloroplasts , 2006, Nucleic acids research.

[21]  S. Horvath,et al.  Analysis of oncogenic signaling networks in glioblastoma identifies ASPM as a molecular target , 2006, Proceedings of the National Academy of Sciences.

[22]  Limsoon Wong,et al.  Exploiting indirect neighbours and topological weight to predict protein function from protein--protein interactions , 2006 .

[23]  Korbinian Strimmer,et al.  An empirical Bayes approach to inferring large-scale gene association networks , 2005, Bioinform..

[24]  A. Barabasi,et al.  Lethality and centrality in protein networks , 2001, Nature.

[25]  Matthew W. Hahn,et al.  Comparative genomics of centrality and essentiality in three eukaryotic protein-interaction networks. , 2005, Molecular biology and evolution.

[26]  Hawoong Jeong,et al.  Prediction of Protein Essentiality Based on Genomic Data , 2002, Complexus.

[27]  D. Goldberg,et al.  Assessing experimentally derived interactions in a small world , 2003, Proceedings of the National Academy of Sciences of the United States of America.

[28]  S. Horvath,et al.  Gene connectivity, function, and sequence conservation: predictions from modular yeast co-expression networks , 2006, BMC Genomics.

[29]  Hongyu Zhao,et al.  Are scale-free networks robust to measurement errors? , 2005, BMC Bioinformatics.