Permutation invariant graph-to-sequence model for template-free retrosynthesis and reaction prediction

Synthesis planning and reaction outcome prediction are two fundamental problems in computer-aided organic chemistry for which a variety of data-driven approaches have emerged. Natural language approaches that model each problem as a SMILES-to-SMILES translation lead to a simple end-to-end formulation, reduce the need for data preprocessing, and enable the use of well-optimized machine translation architectures. However, SMILES strings are not an efficient representation for capturing molecular structure, as evidenced by the success of SMILES augmentation in boosting empirical performance. Here, we describe a novel Graph2SMILES model that combines the power of Transformer models for text generation with the permutation invariance of molecular graph encoders, which eliminates the need for input data augmentation. As an end-to-end architecture, Graph2SMILES can be used as a drop-in replacement for the Transformer in any task involving molecule(s)-to-molecule(s) transformations. In our encoder, an attention-augmented directed message passing neural network (D-MPNN) captures local chemical environments, and a global attention encoder, enhanced by graph-aware positional embeddings, allows for long-range and intermolecular interactions. Graph2SMILES improves the top-1 accuracy of Transformer baselines by 1.7% and 1.9% for reaction outcome prediction on the USPTO_480k and USPTO_STEREO datasets, respectively, and by 9.8% for one-step retrosynthesis on the USPTO_50k dataset.
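
To make the two-stage encoder design concrete, below is a minimal PyTorch sketch: a D-MPNN that passes messages along directed bonds to capture local chemical environments, followed by a standard Transformer encoder whose only positional signal is derived from the graph (here, a learned embedding of each atom's shortest-path distance to an anchor atom, a simplified stand-in for the paper's graph-aware positional embeddings). All module names, feature dimensions, and the single-graph batching scheme are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn


class DMPNNEncoder(nn.Module):
    """Chemprop-style message passing on directed bonds -> per-atom embeddings."""

    def __init__(self, atom_dim, bond_dim, hidden, depth=4):
        super().__init__()
        self.w_in = nn.Linear(atom_dim + bond_dim, hidden)
        self.w_msg = nn.Linear(hidden, hidden)
        self.w_out = nn.Linear(atom_dim + hidden, hidden)
        self.depth = depth

    def forward(self, atom_feats, bond_feats, bond_src, bond_dst, rev_idx):
        # atom_feats: (n_atoms, atom_dim); bond_feats: (n_bonds, bond_dim)
        # bond_src / bond_dst: source / destination atom of each directed bond
        # rev_idx: index of the reverse directed bond for each bond
        n_atoms = atom_feats.size(0)
        h0 = torch.relu(self.w_in(
            torch.cat([atom_feats[bond_src], bond_feats], dim=-1)))
        h = h0
        for _ in range(self.depth):
            # Per-atom sum of messages carried by incoming directed bonds.
            agg = h.new_zeros(n_atoms, h.size(1)).index_add_(0, bond_dst, h)
            # Message for bond (u -> v): everything arriving at u,
            # excluding the reverse bond (v -> u).
            m = agg[bond_src] - h[rev_idx]
            h = torch.relu(h0 + self.w_msg(m))
        # Collapse final bond messages onto their destination atoms.
        atom_msg = h.new_zeros(n_atoms, h.size(1)).index_add_(0, bond_dst, h)
        return torch.relu(self.w_out(torch.cat([atom_feats, atom_msg], dim=-1)))


class Graph2SeqEncoder(nn.Module):
    """Local D-MPNN encoder followed by a global self-attention encoder."""

    def __init__(self, atom_dim, bond_dim, hidden=256, heads=8, layers=6,
                 max_dist=32):
        super().__init__()
        self.local = DMPNNEncoder(atom_dim, bond_dim, hidden)
        # Graph-derived positional signal: shortest-path distance to an
        # anchor atom in each molecule (an assumed simplification).
        self.dist_emb = nn.Embedding(max_dist, hidden)
        layer = nn.TransformerEncoderLayer(hidden, heads, batch_first=True)
        self.globl = nn.TransformerEncoder(layer, layers)

    def forward(self, atom_feats, bond_feats, bond_src, bond_dst, rev_idx,
                anchor_dist):
        # anchor_dist: (n_atoms,) long tensor of graph distances.
        x = self.local(atom_feats, bond_feats, bond_src, bond_dst, rev_idx)
        x = x + self.dist_emb(
            anchor_dist.clamp_max(self.dist_emb.num_embeddings - 1))
        # All atoms of all reactant molecules form one unordered set; no
        # string-order positional encoding is ever added.
        return self.globl(x.unsqueeze(0)).squeeze(0)
```

Because no SMILES-order positional encoding enters this encoder, permuting the atom ordering of the input graph permutes the output atom embeddings identically but leaves their values unchanged, which is the property that removes the need for SMILES augmentation on the input side.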
