Dual-dropout graph convolutional network for predicting synthetic lethality in human cancers

MOTIVATION Synthetic lethality (SL) is a promising type of gene interaction for cancer therapy because it can identify specific genes to target in cancer cells without disrupting normal cells. Because high-throughput wet-lab screens are often costly and face various challenges, computational approaches have become a practical complement. In particular, SL prediction can be formulated as a link prediction task on a graph of interacting genes. Although matrix factorization techniques have been widely adopted for link prediction, they map each gene to a latent representation in isolation, without aggregating information from neighboring genes. Graph convolutional networks (GCNs) can capture such neighborhood dependencies in a graph. However, applying GCNs to SL prediction remains challenging because SL interactions are extremely sparse, which makes overfitting more likely.

RESULTS We propose a novel Dual-Dropout GCN (DDGCN) that learns more robust gene representations for SL prediction. Because standard dropout in a vanilla GCN is often inadequate for reducing overfitting on sparse graphs, we employ both coarse-grained node dropout and fine-grained edge dropout. Coarse-grained node dropout efficiently and systematically enforces dropout at the node (gene) level, while fine-grained edge dropout further fine-tunes dropout at the interaction (edge) level. We also present a theoretical framework to justify our model architecture. Finally, extensive experiments on human SL datasets demonstrate that our model outperforms state-of-the-art methods.

AVAILABILITY DDGCN is implemented in Python 3.7, is open source, and is freely available at https://github.com/CXX1113/Dual-DropoutGCN.
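To make the two dropout levels concrete, the following is a minimal sketch in plain Python and is not the DDGCN implementation from the repository. It assumes that node dropout removes every edge incident to a randomly dropped gene and that edge dropout removes individual surviving interactions. The function names (`node_dropout`, `edge_dropout`, `gcn_layer`, `link_score`), the toy graph, and the dropout rates are all hypothetical, and the trainable weight matrix of a real GCN layer is omitted for brevity.

```python
import math
import random


def node_dropout(adj, rate, rng):
    """Coarse-grained (assumed form): drop whole nodes by removing
    every edge incident to a dropped node."""
    n = len(adj)
    keep = [rng.random() >= rate for _ in range(n)]
    return [[adj[i][j] if keep[i] and keep[j] else 0.0 for j in range(n)]
            for i in range(n)]


def edge_dropout(adj, rate, rng):
    """Fine-grained (assumed form): drop individual undirected edges,
    keeping the adjacency matrix symmetric."""
    n = len(adj)
    out = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if adj[i][j] and rng.random() >= rate:
                out[i][j] = out[j][i] = adj[i][j]
    return out


def gcn_layer(adj, feats):
    """One propagation step ReLU(D^{-1/2} (A + I) D^{-1/2} X),
    without a trainable weight matrix."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]  # >= 1 because of the self-loops
    dim = len(feats[0])
    out = []
    for i in range(n):
        row = []
        for k in range(dim):
            s = sum(a_hat[i][j] / math.sqrt(deg[i] * deg[j]) * feats[j][k]
                    for j in range(n))
            row.append(max(0.0, s))
        out.append(row)
    return out


def link_score(z, i, j):
    """Link prediction score: sigmoid of the inner product of two embeddings."""
    dot = sum(a * b for a, b in zip(z[i], z[j]))
    return 1.0 / (1.0 + math.exp(-dot))


# Toy symmetric gene graph with 4 genes (hypothetical data).
adj = [[0.0, 1.0, 1.0, 0.0],
       [1.0, 0.0, 1.0, 0.0],
       [1.0, 1.0, 0.0, 1.0],
       [0.0, 0.0, 1.0, 0.0]]
feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]

rng = random.Random(0)
dropped = edge_dropout(node_dropout(adj, 0.25, rng), 0.25, rng)
z = gcn_layer(dropped, feats)
score = link_score(z, 0, 1)
```

In this sketch the two masks are applied in sequence to the adjacency matrix before propagation, so the sparsified graph changes on every call during training. At inference time one would typically pass the full graph, which corresponds to a rate of 0 here.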
