Biological network inference using low order partial correlation.

Biological network inference is a major challenge in systems biology. Traditional correlation-based network analysis produces too many spurious edges because correlation cannot distinguish direct from indirect associations. To address this issue, Gaussian graphical models (GGMs) were proposed and have been widely used. Although they significantly reduce the number of spurious edges, GGMs cannot uncover a network structure faithfully because they consider only the full order partial correlation. Moreover, when the number of samples is smaller than the number of variables, further techniques based on sparse regularization must be incorporated into GGMs to solve the singular covariance inversion problem. In this paper, we propose an efficient and mathematically sound algorithm that infers biological networks by computing low order partial correlation (LOPC) up to the second order. The bias introduced by the low order constraint is minimal compared with the more reliable approximation of the network structure that it achieves. In addition, the algorithm is suitable for datasets with a small sample size but a large number of variables. Simulation results show that LOPC yields far fewer spurious edges and works well under various conditions commonly seen in practice. An application to a real metabolomics dataset further validates the performance of LOPC and suggests its potential for detecting novel biomarkers for complex diseases.
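
To make the idea of partial correlation "up to the second order" concrete, the following is a minimal sketch, not the authors' implementation. It uses the standard recursive formulas for first and second order partial correlation computed from the zero order (Pearson) correlation matrix. The function names and the rule of keeping, for each pair, the coefficient with the smallest absolute value across all conditioning sets are assumptions made for illustration; the abstract does not specify how LOPC combines or tests the coefficients.

```python
import itertools
import numpy as np


def first_order(r, i, j, k):
    """Partial correlation of variables i and j given k, from correlation matrix r."""
    num = r[i, j] - r[i, k] * r[j, k]
    den = np.sqrt((1 - r[i, k] ** 2) * (1 - r[j, k] ** 2))
    return num / den


def second_order(r, i, j, k, l):
    """Partial correlation of i and j given k and l, via the recursive formula."""
    r_ij_k = first_order(r, i, j, k)
    r_il_k = first_order(r, i, l, k)
    r_jl_k = first_order(r, j, l, k)
    den = np.sqrt((1 - r_il_k ** 2) * (1 - r_jl_k ** 2))
    return (r_ij_k - r_il_k * r_jl_k) / den


def min_low_order_pcor(data):
    """For each pair, return the zero, first, or second order coefficient
    with the smallest absolute value (an illustrative aggregation rule,
    assumed here, not taken from the paper).

    data: array of shape (n_samples, n_variables).
    """
    r = np.corrcoef(data, rowvar=False)
    p = r.shape[0]
    out = np.zeros((p, p))
    for i, j in itertools.combinations(range(p), 2):
        others = [k for k in range(p) if k not in (i, j)]
        vals = [r[i, j]]  # zero order
        vals += [first_order(r, i, j, k) for k in others]
        vals += [
            second_order(r, i, j, k, l)
            for k, l in itertools.combinations(others, 2)
        ]
        out[i, j] = out[j, i] = min(vals, key=abs)
    return out
```

On a simulated chain X -> Y -> Z, the zero order correlation between X and Z is large, but conditioning on Y drives it toward zero, so a pair like (X, Z) would not be reported as a direct edge under this rule. Because only pairwise correlations are needed, no covariance matrix inversion is involved, which is why this kind of computation remains feasible when samples are fewer than variables. How the resulting coefficients are thresholded or tested for significance is left open in this sketch.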
