MiRNA-gene network embedding for predicting cancer driver genes.

The development and progression of cancer arise from the accumulation of mutations in driver genes. Correctly identifying the driver genes that lead to cancer development can significantly assist drug design, cancer diagnosis and treatment. Most computational methods detect cancer drivers from gene-gene networks, assuming that driver genes tend to work together, form protein complexes and enrich pathways. However, they ignore that microRNAs (miRNAs) regulate the expression of their target genes and are related to human diseases. In this work, we propose a graph convolutional network (GCN) approach called GM-GCN to identify cancer driver genes based on a gene-miRNA network. First, we constructed a gene-miRNA network whose nodes are miRNAs and their target genes. An edge connecting a miRNA and a gene indicates a regulatory relationship between them. We prepared initial attributes for miRNAs and genes according to their biological properties and used a GCN model to learn gene feature representations in the network by aggregating the features of neighboring miRNA nodes. The learned features were then passed through a 1D convolution module to change their dimensionality. We employed both the learned and the original gene features to optimize the model parameters. Finally, the gene features learned from the network and the initial input gene features were fed into a logistic regression model to predict whether a gene is a driver gene. We applied our model and state-of-the-art methods to predict cancer drivers for pan-cancer and individual cancer types. Experimental results show that our model performs well in terms of the area under the receiver operating characteristic curve and the area under the precision-recall curve compared with state-of-the-art methods that work on gene networks. GM-GCN is freely available at https://github.com/weiba/GM-GCN.
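The pipeline described above (neighbor aggregation on a gene-miRNA network, followed by logistic regression on the learned and initial gene features) can be illustrated with a minimal NumPy sketch. This is not the GM-GCN implementation: the toy regulation matrix, the feature dimensions, the random untrained weights, and the function names `normalize_adjacency`, `gcn_layer` and `predict_driver_probability` are all assumptions made for illustration. The sketch uses the standard symmetric-normalized GCN propagation rule, assumes genes and miRNAs share one feature dimension, and omits the 1D convolution module and all training.

```python
import numpy as np

def normalize_adjacency(adj):
    """Symmetric normalization D^-1/2 (A + I) D^-1/2 of a standard GCN layer."""
    adj_hat = adj + np.eye(adj.shape[0])      # add self-loops
    deg = adj_hat.sum(axis=1)                 # degree >= 1 thanks to self-loops
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return adj_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_layer(adj_norm, features, weight):
    """One propagation step: aggregate neighbor features, project, apply ReLU."""
    return np.maximum(adj_norm @ features @ weight, 0.0)

def predict_driver_probability(gene_learned, gene_initial, coef, bias):
    """Logistic regression on concatenated learned and initial gene features."""
    combined = np.concatenate([gene_learned, gene_initial], axis=1)
    logits = combined @ coef + bias
    return 1.0 / (1.0 + np.exp(-logits))

rng = np.random.default_rng(0)
n_genes, n_mirnas, n_feat, n_hidden = 4, 3, 5, 2

# Hypothetical regulation matrix: entry (g, m) = 1 if miRNA m targets gene g.
regulation = np.array([[1, 0, 0],
                       [1, 1, 0],
                       [0, 1, 1],
                       [0, 0, 1]], dtype=float)

# Bipartite adjacency over all nodes, genes first and miRNAs second.
n_nodes = n_genes + n_mirnas
adj = np.zeros((n_nodes, n_nodes))
adj[:n_genes, n_genes:] = regulation
adj[n_genes:, :n_genes] = regulation.T

# Random stand-ins for the biological node attributes and model parameters.
features = rng.normal(size=(n_nodes, n_feat))
weight = rng.normal(size=(n_feat, n_hidden))
coef = rng.normal(size=n_hidden + n_feat)

adj_norm = normalize_adjacency(adj)
hidden = gcn_layer(adj_norm, features, weight)
gene_hidden = hidden[:n_genes]                # keep only the gene rows
probs = predict_driver_probability(gene_hidden, features[:n_genes], coef, 0.0)
print(probs.shape)
```

Because the weights here are random, the resulting probabilities carry no biological meaning; in a trained model, `weight`, `coef` and the bias would be fitted against known driver and non-driver gene labels.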
