MoCL: Contrastive Learning on Molecular Graphs with Multi-level Domain Knowledge

Recent years have seen a rapid growth of utilizing graph neural networks (GNNs) in the biomedical domain for tackling drug-related problems. However, like any other deep architectures, GNNs are data hungry. While requiring labels in real world is often expensive, pretraining GNNs in an unsupervised manner has been actively explored. Among them, graph contrastive learning, by maximizing the mutual information between paired graph augmentations, has been shown to be effective on various downstream tasks. However, the current graph contrastive learning framework has two limitations. First, the augmentations are designed for general graphs and thus may not be suitable or powerful enough for certain domains. Second, the contrastive scheme only learns representations that are invariant to local perturbations and thus does not consider the global structure of the dataset, which may also be useful for downstream tasks. Therefore, in this paper, we study graph contrastive learning in the context of biomedical domain, where molecular graphs are present. We propose a novel framework called MoCL, which utilizes domain knowledge at both localand global-level to assist representation learning. The local-level domain knowledge guides the augmentation process such that variation is introduced without changing graph semantics. The global-level knowledge encodes the similarity information between graphs in the entire dataset and helps to learn representations with richer semantics. The entire model is learned through a double contrast objective. We evaluate MoCL on various molecular datasets under both linear and semi-supervised settings and results show that MoCL achieves state-of-the-art performance. CCS CONCEPTS •Computingmethodologies→Machine learning algorithms; • Applied computing→ Bioinformatics. Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org. KDD ’21, August 14–18, 2021, Virtual Event, Singapore © 2021 Association for Computing Machinery. ACM ISBN 978-1-4503-8332-5/21/08. . . $15.00 https://doi.org/10.1145/3447548.3467186

[1]  Kaveh Hassani,et al.  Contrastive Multi-View Representation Learning on Graphs , 2020, ICML.

[2]  Stella X. Yu,et al.  Unsupervised Feature Learning via Non-parametric Instance Discrimination , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[3]  Jian Tang,et al.  InfoGraph: Unsupervised and Semi-supervised Graph-Level Representation Learning via Mutual Information Maximization , 2019, ICLR.

[4]  Yizhou Sun,et al.  Unsupervised Inductive Whole-Graph Embedding by Preserving Graph Proximity , 2019, ArXiv.

[5]  Antony J. Williams,et al.  ToxCast Chemical Landscape: Paving the Road to 21st Century Toxicology. , 2016, Chemical research in toxicology.

[6]  Alexander A. Alemi,et al.  On Variational Bounds of Mutual Information , 2019, ICML.

[7]  Károly Héberger,et al.  Why is Tanimoto index an appropriate choice for fingerprint-based similarity calculations? , 2015, Journal of Cheminformatics.

[8]  N. Meanwell Synopsis of some recent tactical application of bioisosteres in drug design. , 2011, Journal of medicinal chemistry.

[9]  Vijay S. Pande,et al.  Computational Modeling of β-Secretase 1 (BACE-1) Inhibitors Using Ligand Based Approaches , 2016, J. Chem. Inf. Model..

[10]  Jure Leskovec,et al.  Modeling polypharmacy side effects with graph convolutional networks , 2018, bioRxiv.

[11]  Yatao Bian,et al.  Self-Supervised Graph Transformer on Large-Scale Molecular Data , 2020, NeurIPS.

[12]  Thomas M. Cover,et al.  Elements of Information Theory , 2005 .

[13]  Kilian Q. Weinberger,et al.  Simplifying Graph Convolutional Networks , 2019, ICML.

[14]  Tianlong Chen,et al.  When Does Self-Supervision Help Graph Convolutional Networks? , 2020, ICML.

[15]  Jure Leskovec,et al.  Strategies for Pre-training Graph Neural Networks , 2020, ICLR.

[16]  Peer Bork,et al.  The SIDER database of drugs and side effects , 2015, Nucleic Acids Res..

[17]  Andrea Vedaldi,et al.  Self-labelling via simultaneous clustering and representation learning , 2020, ICLR.

[18]  Natalia Gimelshein,et al.  PyTorch: An Imperative Style, High-Performance Deep Learning Library , 2019, NeurIPS.

[19]  Max Welling,et al.  Variational Graph Auto-Encoders , 2016, ArXiv.

[20]  Percy Liang,et al.  Graph-based, Self-Supervised Program Repair from Diagnostic Feedback , 2020, ICML.

[21]  G. Klambauer,et al.  Graph networks for molecular design , 2020, Mach. Learn. Sci. Technol..

[22]  Pietro Liò,et al.  Deep Graph Infomax , 2018, ICLR.

[23]  Qiang Liu,et al.  Graph Contrastive Learning with Adaptive Augmentation , 2020, WWW.

[24]  Emma J. Chory,et al.  A Deep Learning Approach to Antibiotic Discovery , 2020, Cell.

[25]  Samy Bengio,et al.  Large Scale Online Learning of Image Similarity Through Ranking , 2009, J. Mach. Learn. Res..

[26]  Minnan Luo,et al.  Self-Supervised Graph Representation Learning via Global Context Prediction , 2020, ArXiv.

[27]  M M Tai,et al.  A Mathematical Model for the Determination of Total Area Under Glucose Tolerance and Other Metabolic Curves , 1994, Diabetes Care.

[28]  Ryan A. Rossi,et al.  The Network Data Repository with Interactive Graph Analytics and Visualization , 2015, AAAI.

[29]  Samuel S. Schoenholz,et al.  Neural Message Passing for Quantum Chemistry , 2017, ICML.

[30]  Luis Pinheiro,et al.  A Bayesian Approach to in Silico Blood-Brain Barrier Penetration Modeling , 2012, J. Chem. Inf. Model..

[31]  David Rogers,et al.  Extended-Connectivity Fingerprints , 2010, J. Chem. Inf. Model..

[32]  Thomas Brox,et al.  Discriminative Unsupervised Feature Learning with Convolutional Neural Networks , 2014, NIPS.

[33]  Ziniu Hu,et al.  Motif-Driven Contrastive Learning of Graph Representations , 2020, AAAI.

[34]  Zhanxing Zhu,et al.  Multi-Stage Self-Supervised Learning for Graph Convolutional Networks , 2020, AAAI.

[35]  Yuxiao Dong,et al.  GCC: Graph Contrastive Coding for Graph Neural Network Pre-Training , 2020, KDD.

[36]  Jimeng Sun,et al.  Pre-training of Graph Augmented Transformers for Medication Recommendation , 2019, IJCAI.

[37]  Kaitlyn M. Gayvert,et al.  A Data-Driven Approach to Predicting Successes and Failures of Clinical Trials. , 2016, Cell chemical biology.

[38]  Max Welling,et al.  Semi-Supervised Classification with Graph Convolutional Networks , 2016, ICLR.

[39]  H. Shirakawa,et al.  Prediction of pharmacological activities from chemical structures with graph convolutional neural networks , 2021, Scientific Reports.

[40]  Oriol Vinyals,et al.  Representation Learning with Contrastive Predictive Coding , 2018, ArXiv.

[41]  Zhangyang Wang,et al.  Graph Contrastive Learning with Augmentations , 2020, NeurIPS.

[42]  Vijay S. Pande,et al.  SWEETLEAD: an In Silico Database of Approved Drugs, Regulated Chemicals, and Herbal Isolates for Computer-Aided Drug Discovery , 2013, PloS one.

[43]  Michael Tschannen,et al.  On Mutual Information Maximization for Representation Learning , 2019, ICLR.

[44]  Geoffrey E. Hinton,et al.  A Simple Framework for Contrastive Learning of Visual Representations , 2020, ICML.

[45]  Yizhou Sun,et al.  GPT-GNN: Generative Pre-Training of Graph Neural Networks , 2020, KDD.

[46]  Jure Leskovec,et al.  How Powerful are Graph Neural Networks? , 2018, ICLR.