Quantifying Direct Dependencies in Biological Networks by Multiscale Association Analysis

Partial correlation (PC) and conditional mutual information (CMI) are widely used to detect direct dependencies between observed variables in biological networks by eliminating indirect correlations or associations, but both fail whenever a network contains strong correlations. In this paper, we develop the theory of multiscale association analysis to overcome this flaw. We propose a new measure, partial association (PA), based on multiscale conditional mutual information. We show that linear PA and nonlinear PA have clear advantages over PC and CMI in both theoretical and computational terms. Results on both simulated models and real omics datasets show that PA is more accurate than PC and CMI, and that it is a powerful tool for identifying direct associations or reconstructing molecular networks from observed data. Survival and functional analyses of the hub genes in gene networks reconstructed from TCGA data for different cancers also validate the effectiveness of our method.
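To make concrete what "eliminating indirect correlations" means for the baseline PC measure, the following is a minimal sketch of standard partial correlation computed from the inverse covariance (precision) matrix. It is not the PA measure proposed here; the function name `partial_correlation_matrix` and the three-variable chain X → Y → Z are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def partial_correlation_matrix(data):
    """Partial correlation of each pair of columns given all other columns.

    Uses the standard identity pcorr_ij = -P_ij / sqrt(P_ii * P_jj),
    where P is the inverse of the sample covariance matrix.
    """
    cov = np.cov(data, rowvar=False)
    precision = np.linalg.inv(cov)
    d = np.sqrt(np.diag(precision))
    pcorr = -precision / np.outer(d, d)
    np.fill_diagonal(pcorr, 1.0)
    return pcorr

# Hypothetical toy chain X -> Y -> Z: X and Z are linked only through Y.
rng = np.random.default_rng(0)
n = 5000
x = rng.normal(size=n)
y = x + 0.5 * rng.normal(size=n)
z = y + 0.5 * rng.normal(size=n)
data = np.column_stack([x, y, z])

corr = np.corrcoef(data, rowvar=False)    # ordinary (marginal) correlation
pcorr = partial_correlation_matrix(data)  # correlation after conditioning

print("corr(X, Z)      =", round(corr[0, 2], 3))
print("pcorr(X, Z | Y) =", round(pcorr[0, 2], 3))
```

In this toy chain the marginal correlation between X and Z is large, while their partial correlation given Y is close to zero, which is the sense in which PC removes an indirect association. The sketch does not reproduce the failure case under strong correlations described above.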
