Testing for pathway (in)activation by using Gaussian graphical models

Genes work together in sets known as pathways to contribute to cellular processes such as apoptosis and cell proliferation. Pathway activation, or inactivation, may be reflected in varying partial correlations between the expression levels of the genes that constitute the pathway. Here we present a method to identify pathway activation status from two-sample studies. The expression levels in each group are modelled by a Gaussian graphical model, with the partial correlations of the two groups assumed proportional: they differ by a common multiplier that reflects the activation status. We estimate the model parameters by penalized maximum likelihood and evaluate the performance of the estimation procedure in a simulation study. We also propose a permutation scheme to test for pathway activation status. A reanalysis of publicly available data on the hedgehog pathway in normal and cancerous prostate tissue shows its activation in the disease group, an indication that this pathway is involved in oncogenesis. Extensive diagnostics employed in the reanalysis complete the proposed methodology.
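
The idea of a common multiplier between the partial correlations of two groups, tested by permutation, can be illustrated with a minimal sketch. This is not the estimator of the paper, which uses penalized maximum likelihood. As stand-ins, the sketch assumes a simple ridge-regularised precision matrix and a least-squares slope for the multiplier. The function names, the ridge value, and the test statistic |multiplier - 1| are all hypothetical choices made for illustration. Permuting group labels also assumes that the two groups are exchangeable under the null hypothesis, which is stronger than a multiplier of one.

```python
import numpy as np


def partial_correlations(x, ridge=0.1):
    """Upper-triangular partial correlations of the columns of x.

    Assumption: the precision matrix is the inverse of the sample
    covariance plus ridge * identity (a stand-in for the paper's
    penalized maximum likelihood estimate).
    """
    s = np.cov(x, rowvar=False)           # rows are samples, columns are genes
    p = s.shape[0]
    omega = np.linalg.inv(s + ridge * np.eye(p))
    d = np.sqrt(np.diag(omega))
    pcor = -omega / np.outer(d, d)        # standard precision-to-partial-correlation map
    return pcor[np.triu_indices(p, k=1)]  # keep each gene pair once


def multiplier(x1, x2, ridge=0.1):
    """Least-squares slope relating group 2 partial correlations to group 1.

    Assumption: a slope through the origin is used as a crude estimate
    of the common multiplier.
    """
    r1 = partial_correlations(x1, ridge)
    r2 = partial_correlations(x2, ridge)
    return float(r1 @ r2 / (r1 @ r1))


def permutation_test(x1, x2, n_perm=500, ridge=0.1, seed=0):
    """Permutation p-value for the null hypothesis of a multiplier equal to one."""
    rng = np.random.default_rng(seed)
    observed = multiplier(x1, x2, ridge)
    obs_stat = abs(observed - 1.0)
    pooled = np.vstack([x1, x2])
    n1 = x1.shape[0]
    exceed = 0
    for _ in range(n_perm):
        idx = rng.permutation(pooled.shape[0])  # shuffle group labels
        perm = multiplier(pooled[idx[:n1]], pooled[idx[n1:]], ridge)
        if abs(perm - 1.0) >= obs_stat:
            exceed += 1
    # The +1 terms count the observed labelling as one permutation.
    return observed, (exceed + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Synthetic data only: 40 samples per group, 5 hypothetical genes.
    rng = np.random.default_rng(42)
    group1 = rng.standard_normal((40, 5))
    group2 = rng.standard_normal((40, 5))
    est, p_value = permutation_test(group1, group2, n_perm=200)
    print(f"estimated multiplier: {est:.3f}, permutation p-value: {p_value:.3f}")
```

The slope through the origin treats the two groups asymmetrically, so swapping the groups does not simply invert the estimate. A joint estimator, as described in the abstract, avoids this.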
