CNet: a multi-omics approach to detecting clinically associated, combinatory genomic signatures

MOTIVATION Genome-wide multi-omics profiling of complex diseases provides valuable resources and opportunities to discover associations between various measures of genes and diseases. A pressing challenge is how to effectively detect functional genes that are associated with, or cause, phenotypic outcomes. We developed CNet to identify groups of genomic signatures whose combinatory effect is significantly associated with clinical and phenotypical outcomes.

RESULTS CNet builds on a generalized sequential feedforward method, augmented by a down-sampling bootstrap strategy to reduce random hitchhiking signatures. It further applies a dynamic trimming procedure that removes relatively less informative signatures at every step. CNet can manage heterogeneous genomic signature profiles simultaneously and select the best signature to represent a specific gene. To accommodate various forms of clinical and phenotypical measurements, we introduced four models that cover continuous, categorical, and censored data. We tested CNet using drug-response data, multidimensional cancer genomics data, and genome-wide association study data for multiple traits. Our results demonstrated that, in various scenarios, CNet could effectively identify signatures associated with the outcomes. In addition, we applied CNet to identify likely disease-causing chains involving somatic mutations, pathway activities, and patient outcomes. With appropriate settings, CNet can be applied to many biological conditions.

AVAILABILITY CNet can be downloaded at https://github.com/bsml320/CNet.

SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
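To make the selection procedure described under RESULTS more concrete, the following is a minimal sketch of a sequential forward selection with a down-sampling bootstrap and a per-step trimming of candidates. It is not CNet's implementation: the function names, all parameters (`n_boot`, `frac`, `trim_frac`, `min_gain`), and the use of a least-squares R² score for a continuous outcome are assumptions made for illustration only. The categorical and censored-outcome models mentioned in the abstract are not shown.

```python
import numpy as np

def r_squared(X, y):
    """Fraction of variance in y explained by a least-squares fit on X."""
    A = np.column_stack([np.ones(len(y)), X])  # add an intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return 1.0 - np.sum(resid ** 2) / np.sum((y - y.mean()) ** 2)

def forward_select(X, y, max_features=5, n_boot=30, frac=0.8,
                   trim_frac=0.2, min_gain=0.01, seed=0):
    """Hypothetical sketch: greedy forward selection of feature columns.

    All parameter names and defaults are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    selected, candidates, best_score = [], list(range(p)), 0.0
    while candidates and len(selected) < max_features:
        # Down-sampling bootstrap: draw random subsets of samples
        # (without replacement) and average each candidate's score over them.
        subsets = [rng.choice(n, size=int(frac * n), replace=False)
                   for _ in range(n_boot)]
        scores = {}
        for j in candidates:
            cols = selected + [j]
            scores[j] = np.mean([r_squared(X[np.ix_(idx, cols)], y[idx])
                                 for idx in subsets])
        ranked = sorted(candidates, key=scores.get, reverse=True)
        top = ranked[0]
        # Stop when the best candidate no longer adds a meaningful gain.
        if scores[top] - best_score < min_gain:
            break
        selected.append(top)
        best_score = scores[top]
        # Trimming: drop the lowest-scoring share of the remaining candidates.
        rest = ranked[1:]
        candidates = rest[:len(rest) - int(trim_frac * len(rest))]
    return selected, best_score

# Synthetic usage example: only columns 3 and 7 drive the outcome.
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 20))
y = 2.0 * X[:, 3] - 1.5 * X[:, 7] + rng.normal(scale=0.5, size=200)
selected, score = forward_select(X, y)
print(selected, round(float(score), 3))
```

Averaging the score over random subsets of samples is what discourages a "hitchhiking" feature that looks informative only because of a few samples, and trimming shrinks the candidate pool so later steps evaluate fewer combinations.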
