Are Learned Molecular Representations Ready For Prime Time?

Kevin Yang,∗,† Kyle Swanson,∗,† Wengong Jin,† Connor Coley,‡ Philipp Eiden,¶ Hua Gao,§ Angel Guzman-Perez,§ Timothy Hopper,§ Brian Kelley,‖ Miriam Mathea,¶ Andrew Palmer,¶ Volker Settels,¶ Tommi Jaakkola,† Klavs Jensen,‡ and Regina Barzilay†

†Computer Science and Artificial Intelligence Laboratory, MIT, Cambridge, MA
‡Department of Chemical Engineering, MIT, Cambridge, MA
¶BASF SE, Ludwigshafen, Germany
§Amgen Inc., Cambridge, MA
‖Novartis Institutes for BioMedical Research, Cambridge, MA

[1]  Tatsuya Takagi,et al.  Mordred: a molecular descriptor calculator , 2018, Journal of Cheminformatics.

[2]  Pierre Baldi,et al.  Deep Architectures and Deep Learning in Chemoinformatics: The Prediction of Aqueous Solubility for Drug-Like Molecules , 2013, J. Chem. Inf. Model..

[3]  Lei Jia,et al.  Chemi-net: a graph convolutional network for accurate drug property prediction , 2018, ArXiv.

[4]  David Rogers,et al.  Extended-Connectivity Fingerprints , 2010, J. Chem. Inf. Model..

[5]  Klaus-Robert Müller,et al.  SchNet: A continuous-filter convolutional neural network for modeling quantum interactions , 2017, NIPS.

[6]  Bing Huang,et al.  Machine learning prediction errors better than DFT accuracy , 2017, 1702.05532.

[7]  Brian K. Shoichet,et al.  ZINC - A Free Database of Commercially Available Compounds for Virtual Screening , 2005, J. Chem. Inf. Model..

[8]  Manuela Pavan,et al.  DRAGON SOFTWARE: AN EASY APPROACH TO MOLECULAR DESCRIPTOR CALCULATIONS , 2006 .

[9]  Katsuhiko Ishiguro,et al.  Graph Warp Module: an Auxiliary Module for Boosting the Power of Graph Neural Networks , 2019, ArXiv.

[10]  Regina Barzilay,et al.  Junction Tree Variational Autoencoder for Molecular Graph Generation , 2018, ICML.

[11]  Gang Fu,et al.  PubChem Substance and Compound databases , 2015, Nucleic Acids Res..

[12]  James G. Nourse,et al.  Reoptimization of MDL Keys for Use in Drug Discovery , 2002, J. Chem. Inf. Comput. Sci..

[13]  Vijay S. Pande,et al.  MoleculeNet: a benchmark for molecular machine learning , 2017, Chemical science.

[14]  Regina Barzilay,et al.  Convolutional Embedding of Attributed Molecular Graphs for Physical Property Prediction , 2017, J. Chem. Inf. Model..

[15]  Luca Antiga,et al.  Automatic differentiation in PyTorch , 2017 .

[16]  Tatsuya Akutsu,et al.  Extensions of marginalized graph kernels , 2004, ICML.

[17]  Matt J. Kusner,et al.  Grammar Variational Autoencoder , 2017, ICML.

[18]  Alexandre Tkatchenko,et al.  Quantum-chemical insights from deep tensor neural networks , 2016, Nature Communications.

[19]  Chris Hans.  Bayesian lasso regression , 2009 .

[20]  Nir Friedman,et al.  Probabilistic Graphical Models - Principles and Techniques , 2009 .

[21]  Pierre Baldi,et al.  Kernels for small molecules and the prediction of mutagenicity, toxicity and anti-cancer activity , 2005, ISMB.

[22]  Joan Bruna,et al.  Spectral Networks and Locally Connected Networks on Graphs , 2013, ICLR.

[23]  Ah Chung Tsoi,et al.  The Graph Neural Network Model , 2009, IEEE Transactions on Neural Networks.

[24]  David Weininger,et al.  SMILES, a chemical language and information system. 1. Introduction to methodology and encoding rules , 1988, J. Chem. Inf. Comput. Sci..

[25]  Xavier Bresson,et al.  Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering , 2016, NIPS.

[26]  Geoffrey E. Hinton,et al.  Rectified Linear Units Improve Restricted Boltzmann Machines , 2010, ICML.

[27]  J S Smith,et al.  ANI-1: an extensible neural network potential with DFT accuracy at force field computational cost , 2016, Chemical science.

[28]  David A. Price,et al.  Ligand biological activity predicted by cleaning positive and negative chemical correlations , 2019, Proceedings of the National Academy of Sciences.

[29]  Dong-Sheng Cao,et al.  ChemoPy: freely available python package for computational biology and chemoinformatics , 2013, Bioinform..

[30]  Alán Aspuru-Guzik,et al.  Convolutional Networks on Graphs for Learning Molecular Fingerprints , 2015, NIPS.

[31]  Alessandro Sperduti,et al.  Pre-training Graph Neural Networks with Kernels , 2018, ArXiv.

[32]  William Stafford Noble,et al.  Support vector machine , 2013 .

[33]  Pierre Baldi,et al.  Influence Relevance Voting: An Accurate And Interpretable Virtual High Throughput Screening Method , 2009, J. Chem. Inf. Model..

[34]  Regina Barzilay,et al.  Learning Multimodal Graph-to-Graph Translation for Molecular Optimization , 2018, ICLR.

[35]  Alán Aspuru-Guzik,et al.  Automatic Chemical Design Using a Data-Driven Continuous Representation of Molecules , 2016, ACS central science.

[36]  Robert P. Sheridan,et al.  Deep Neural Nets as a Method for Quantitative Structure-Activity Relationships , 2015, J. Chem. Inf. Model..

[37]  Joan Bruna,et al.  Deep Convolutional Networks on Graph-Structured Data , 2015, ArXiv.

[38]  Samuel S. Schoenholz,et al.  Neural Message Passing for Quantum Chemistry , 2017, ICML.

[39]  Max Welling,et al.  Semi-Supervised Classification with Graph Convolutional Networks , 2016, ICLR.

[40]  Thomas G. Dietterich.  Multiple Classifier Systems , 2000, Lecture Notes in Computer Science.

[41]  Nando de Freitas,et al.  Taking the Human Out of the Loop: A Review of Bayesian Optimization , 2016, Proceedings of the IEEE.

[42]  Vijay S. Pande,et al.  Molecular graph convolutions: moving beyond fingerprints , 2016, Journal of Computer-Aided Molecular Design.

[43]  Mario Medvedovic,et al.  LRpath: a logistic regression approach for identifying enriched biological groups in gene expression data , 2009, Bioinform..

[44]  Yang Li,et al.  PotentialNet for Molecular Property Prediction , 2018, ACS central science.

[45]  Hugo Ceulemans,et al.  Large-scale comparison of machine learning methods for drug target prediction on ChEMBL , 2018, Chemical science.

[46]  J. Friedman.  Greedy function approximation: A gradient boosting machine , 2001 .

[47]  Regina Barzilay,et al.  Deriving Neural Architectures from Sequence and Graph Kernels , 2017, ICML.

[48]  Andreas Verras,et al.  Is Multitask Deep Learning Practical for Pharma? , 2017, J. Chem. Inf. Model..

[49]  Le Song,et al.  Discriminative Embeddings of Latent Variable Models for Structured Data , 2016, ICML.

[50]  Jure Leskovec,et al.  Inductive Representation Learning on Large Graphs , 2017, NIPS.

[51]  Razvan Pascanu,et al.  Interaction Networks for Learning about Objects, Relations and Physics , 2016, NIPS.

[52]  Lawrence E. Barker,et al.  Logit Models From Economics and Other Fields , 2005, Technometrics.

[53]  Risi Kondor,et al.  Covariant Compositional Networks For Learning Graphs , 2018, ICLR.

[54]  Robert P. Sheridan,et al.  Time-Split Cross-Validation as a Method for Estimating the Goodness of Prospective Prediction , 2013, J. Chem. Inf. Model..

[55]  Abhinav Vishnu,et al.  Using Rule-Based Labels for Weak Supervised Learning: A ChemNet for Transferable Chemical Property Prediction , 2017, KDD.