Identifying Gene Network Rewiring Using Robust Differential Graphical Model with Multivariate $t$-Distribution

Identifying gene network rewiring under different biological conditions is important for understanding the mechanisms underlying complex diseases. Gaussian graphical models, which assume that the data follow a multivariate normal distribution, are widely used to identify gene network rewiring. However, the normality assumption often fails in practice because real data are frequently contaminated by extreme outliers. In this study, we propose a new robust differential graphical model, based on the multivariate t-distribution, to identify gene network rewiring between two conditions. The multivariate t-distribution is more robust to outliers than the normal distribution because its heavy tails assign non-negligible probability to values far from the mean. A fused lasso penalty is used to borrow information across the two conditions and improve the estimates. We develop an expectation-maximization algorithm to solve the resulting optimization problem. Experimental results on simulated data show that our method outperforms state-of-the-art methods. We also apply our method to identify gene network rewiring between the luminal A and basal-like subtypes of breast cancer, and discover several key genes that drive the rewiring.
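To illustrate why the t-distribution gives robustness inside an expectation-maximization loop, the following is a minimal sketch of the standard, unpenalized EM updates for a single multivariate t-distribution with fixed degrees of freedom. It is not the authors' algorithm: the fused lasso penalty and the joint estimation across two conditions are omitted, and the function names `t_em_weights` and `t_em_fit`, the default `nu=3.0`, and the iteration count are assumptions made for illustration only.

```python
import numpy as np


def t_em_weights(X, mu, Theta, nu):
    """E-step: weight tau_i = (nu + p) / (nu + delta_i) for each sample, where
    delta_i is the squared Mahalanobis distance under precision matrix Theta."""
    p = X.shape[1]
    diff = X - mu
    delta = np.einsum("ij,jk,ik->i", diff, Theta, diff)
    return (nu + p) / (nu + delta)


def t_em_fit(X, nu=3.0, n_iter=50):
    """Unpenalized EM for the location mu and scatter S of a multivariate t
    with fixed degrees of freedom nu (illustrative assumption: nu is known)."""
    n, p = X.shape
    mu = X.mean(axis=0)
    # Small ridge keeps the initial scatter invertible.
    S = np.cov(X, rowvar=False) + 1e-6 * np.eye(p)
    for _ in range(n_iter):
        # E-step: samples far from the current centre get small weights.
        tau = t_em_weights(X, mu, np.linalg.inv(S), nu)
        # M-step: weighted mean and weighted scatter matrix.
        mu = (tau[:, None] * X).sum(axis=0) / tau.sum()
        diff = X - mu
        S = (tau[:, None] * diff).T @ diff / n
    return mu, S, tau
```

Samples with a large Mahalanobis distance receive weights close to zero, so they contribute little to the weighted scatter matrix. In a penalized setting such as the one described above, a weighted scatter of this kind would take the place of the sample covariance in the M-step.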
