A new insight into underlying disease mechanism through semi-parametric latent differential network model

Background: In genomic studies, investigating how the structure of a genetic network differs between two experimental conditions is an interesting but challenging problem, especially in the high-dimensional setting. The existing literature mostly focuses on differential network modelling for continuous data. In real applications, however, we may encounter discrete or mixed data, which motivates a unified differential network model for various data types.

Results: We propose a unified latent Gaussian copula differential network model, which provides a deeper understanding of the unknown mechanism than a model of the relationships among the observed variables alone. Adaptive rank-based estimation approaches are proposed under the assumption that the true differential network is sparse. These approaches do not require the individual precision matrices to be sparse, and thus allow the individual networks to contain hub nodes. Theoretical analysis shows that the proposed methods achieve the same parametric convergence rate for both estimation of the difference of the precision matrices and recovery of the differential structure, which means that the extra modelling flexibility comes at almost no cost in statistical efficiency. In addition to the theoretical analysis, thorough numerical simulations compare the empirical performance of the proposed methods with that of other state-of-the-art methods. The results show that the proposed methods work well for various data types. The proposed method is then applied to gene expression data associated with lung cancer to illustrate its empirical usefulness.

Conclusions: The proposed latent variable differential network models allow for various data types and are thus more flexible; they also provide a deeper understanding of the unknown mechanism than models of the observed variables alone. Theoretical analysis, numerical simulation and the real application all demonstrate the advantages of latent differential network modelling, which is therefore highly recommended.
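
To illustrate what "rank-based estimation" means in a Gaussian copula setting, the following minimal sketch estimates a latent correlation matrix from Kendall's tau using the standard sine transform, sin(π/2 · τ). It covers only the continuous case and assumes no ties in the data; it is not the adaptive estimator proposed in the paper, and the function names and toy data are hypothetical. The point it demonstrates is that the estimate depends on the data only through ranks, so unknown strictly increasing transformations of each variable leave it unchanged.

```python
import math


def kendall_tau(x, y):
    """Kendall's tau for two equal-length samples, assuming no ties."""
    n = len(x)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            if prod > 0:
                score += 1   # concordant pair
            elif prod < 0:
                score -= 1   # discordant pair
    return 2.0 * score / (n * (n - 1))


def latent_correlation(rows):
    """Rank-based latent correlation estimate (continuous case only).

    rows: list of samples, each a list of p numeric values.
    Returns a p x p matrix with entries sin(pi/2 * tau_jk).
    """
    p = len(rows[0])
    cols = [[row[j] for row in rows] for j in range(p)]
    R = [[1.0] * p for _ in range(p)]
    for j in range(p):
        for k in range(j + 1, p):
            tau = kendall_tau(cols[j], cols[k])
            R[j][k] = R[k][j] = math.sin(math.pi / 2 * tau)
    return R


# Hypothetical toy data: 6 samples, 3 variables.
toy = [
    [0.2, 1.1, -0.5],
    [1.5, 0.3, 0.9],
    [-0.7, -1.2, 0.1],
    [0.9, 2.0, -1.4],
    [-1.3, -0.4, 1.8],
    [2.1, 0.8, 0.4],
]

# Apply strictly increasing transforms to the first two variables.
warped = [[math.exp(a), b ** 3, c] for a, b, c in toy]

# Ranks are unchanged, so the two estimates coincide.
same = latent_correlation(toy) == latent_correlation(warped)
```

In a differential network analysis, one such matrix would be computed per experimental condition and then passed to a sparse estimator of the difference of the two precision matrices; that second step is not shown here.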
