Reconstruction of Metabolic Association Networks Using High-throughput Mass Spectrometry Data

Graphical Gaussian models (GGMs) have been widely used in genomics and proteomics to infer biological association networks, but the relative performance of the various GGM-based methods in metabolomics is still unclear. In a GGM, the association between two nodes is measured by their partial correlation, which serves as a measure of conditional independence. Two approaches have been introduced to estimate partial correlations when the sample size is small and the number of variables is large: arithmetic mean-based and geometric mean-based methods. In this study, we investigated the effects of these two approaches on the construction of metabolite association networks and compared their performance using partial least squares regression and principal component regression, with the shrinkage covariance estimate as a reference. The approaches were then applied to simulated data and to real metabolomics data.
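As a minimal illustration of how partial correlation captures conditional independence, the sketch below derives partial correlations from the inverse covariance (precision) matrix using the standard identity: the partial correlation between variables i and j is the negated (i, j) precision entry divided by the square root of the product of the two diagonal entries. This is not the estimation procedure compared in the study. The function name `partial_correlations` and the `ridge` parameter are assumptions made for illustration, and the ridge term is only a simple stand-in for regularization, not the shrinkage covariance estimator.

```python
import numpy as np


def partial_correlations(data, ridge=0.0):
    """Partial correlation matrix from the inverse covariance matrix.

    data  : array of shape (n_samples, n_variables)
    ridge : hypothetical regularization constant added to the diagonal of
            the covariance matrix so it can be inverted when samples are
            few; a stand-in, not the shrinkage estimator.
    """
    cov = np.cov(data, rowvar=False)
    cov = cov + ridge * np.eye(cov.shape[0])
    precision = np.linalg.inv(cov)

    # pcor_ij = -omega_ij / sqrt(omega_ii * omega_jj)
    scale = np.sqrt(np.diag(precision))
    pcor = -precision / np.outer(scale, scale)
    np.fill_diagonal(pcor, 1.0)
    return pcor


# Simulated chain x -> y -> z: x and z are correlated, but they are
# conditionally independent given y.
rng = np.random.default_rng(0)
x = rng.normal(size=20000)
y = x + rng.normal(size=20000)
z = y + rng.normal(size=20000)
pcor = partial_correlations(np.column_stack([x, y, z]))
```

In this simulated chain, the ordinary correlation between x and z is clearly nonzero, while their partial correlation given y should be close to zero, so a network built on partial correlations would keep only the edges x-y and y-z.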
