JDINAC: joint density-based non-parametric differential interaction network analysis and classification using high-dimensional sparse omics data

Motivation: A complex disease is usually driven by a number of genes interwoven into networks, rather than by a single gene product. Network comparison, or differential network analysis, has become an important means of revealing the underlying mechanisms of pathogenesis and of identifying clinical biomarkers for disease classification. Most studies, however, are limited to network correlations that mainly capture linear relationships among genes, or rely on the assumption that gene measurements follow a parametric probability distribution. Both limitations restrict their use in real applications.

Results: We propose a new Joint Density-based Non-parametric Differential Interaction Network Analysis and Classification (JDINAC) method to identify differential interaction patterns of network activation between two groups. At the same time, JDINAC uses the network biomarkers to build a classification model. The novelty of JDINAC lies in its potential to capture non-linear relations between molecular interactions using high-dimensional sparse data, and to adjust for confounding factors, without assuming a parametric probability distribution of gene measurements. Simulation studies demonstrate that JDINAC provides more accurate differential network estimation and lower classification error than other state-of-the-art methods. We apply JDINAC to a Breast Invasive Carcinoma dataset that includes 114 patients who have both tumor and matched normal samples. The hub genes and differential interaction patterns identified were consistent with existing experimental studies. Furthermore, JDINAC discriminated tumor from normal samples with high accuracy by virtue of the identified biomarkers. JDINAC provides a general framework for feature selection and classification using high-dimensional sparse omics data.
Availability and implementation: R scripts are available at https://github.com/jijiadong/JDINAC

Contact: lxie@iscb.org

Supplementary information: Supplementary data are available at Bioinformatics online.
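
The abstract describes the core idea only in words: compare the joint density of a gene pair between two groups without assuming a parametric distribution. As a rough illustration, the sketch below estimates the joint density of one gene pair in each group with a Gaussian product kernel and reports the log density ratio at a query point. It is written in Python rather than R, and it is not the authors' implementation: the fixed bandwidth, the simulated data, and every function name (`kde2d`, `log_density_ratio`, `simulate_pairs`) are assumptions made for this example.

```python
import math
import random


def kde2d(points, bandwidth):
    """Build a 2-D Gaussian product-kernel density estimate from (x, y) points."""
    n = len(points)
    # Normalizing constant of an isotropic 2-D Gaussian kernel, averaged over n points.
    norm = 1.0 / (n * 2.0 * math.pi * bandwidth ** 2)

    def density(x, y):
        total = 0.0
        for px, py in points:
            sq_dist = (x - px) ** 2 + (y - py) ** 2
            total += math.exp(-sq_dist / (2.0 * bandwidth ** 2))
        return norm * total

    return density


def log_density_ratio(group1_pairs, group0_pairs, bandwidth=0.5, eps=1e-12):
    """Return a function giving log(f(x, y) / g(x, y)) for one gene pair.

    f and g are kernel estimates of the joint density in group 1 and group 0.
    The bandwidth is fixed here for simplicity (an assumption of this sketch);
    eps guards against taking the log of zero.
    """
    f = kde2d(group1_pairs, bandwidth)
    g = kde2d(group0_pairs, bandwidth)

    def score(x, y):
        return math.log((f(x, y) + eps) / (g(x, y) + eps))

    return score


def simulate_pairs(n, coupled, rng):
    """Toy expression values for two genes: tightly related if coupled, else independent."""
    pairs = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        y = x + 0.3 * rng.gauss(0.0, 1.0) if coupled else rng.gauss(0.0, 1.0)
        pairs.append((x, y))
    return pairs


rng = random.Random(0)
tumor = simulate_pairs(200, coupled=True, rng=rng)    # hypothetical group 1
normal = simulate_pairs(200, coupled=False, rng=rng)  # hypothetical group 0
score = log_density_ratio(tumor, normal)

on_diagonal = score(1.5, 1.5)    # both genes high: typical of the coupled group
off_diagonal = score(1.5, -1.5)  # genes disagree: atypical of the coupled group
print(on_diagonal > 0, off_diagonal < 0)
```

A positive score means the observed pair of values is better explained by the first group's joint density, a negative score by the second's. In a high-dimensional setting, one plausible use of such scores (again an assumption here, suggested by the abstract's emphasis on sparsity and feature selection) is to compute them for many gene pairs and pass them to a sparsity-inducing classifier so that only a few pairs are retained as network biomarkers.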
