Predicting Electron Paths

Chemical reactions can be described as the stepwise redistribution of electrons in molecules. As such, reactions are often depicted using "arrow-pushing" diagrams, which show this movement as a sequence of arrows. We propose an electron path prediction model (ELECTRO) to learn these sequences directly from raw reaction data. Compared with predicting product molecules directly from reactant molecules in one shot, learning a model of electron movement (a) is easy for chemists to interpret, (b) incorporates constraints of chemistry, such as balanced atom counts before and after the reaction, and (c) naturally encodes the sparsity of chemical reactions, which usually change only a small number of atoms in the reactants. We design a method to extract approximate reaction paths from any dataset of atom-mapped reaction SMILES strings. Our model achieves state-of-the-art results on a subset of the USPTO reaction dataset. Furthermore, we show that our model recovers a basic knowledge of chemistry without being explicitly trained to do so.
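To make the idea of extracting a path from atom-mapped data concrete, the following is a minimal toy sketch, not the extraction procedure used for ELECTRO. It assumes each side of a reaction has already been parsed into a dictionary from atom-map-number pairs to bond orders. The helper names (`bond`, `changed_bonds`, `linear_path`) and the rule for choosing the starting atom are hypothetical, introduced only for illustration.

```python
from typing import Dict, FrozenSet, List

# Assumed toy representation: {frozenset({map_i, map_j}): bond order}
Bonds = Dict[FrozenSet[int], int]


def bond(a: int, b: int) -> FrozenSet[int]:
    """Key for the (undirected) bond between two atom-map numbers."""
    return frozenset((a, b))


def changed_bonds(reactants: Bonds, products: Bonds) -> Bonds:
    """Bond-order change for every atom pair whose bond order differs."""
    pairs = set(reactants) | set(products)
    delta = {p: products.get(p, 0) - reactants.get(p, 0) for p in pairs}
    return {p: d for p, d in delta.items() if d != 0}


def linear_path(delta: Bonds) -> List[int]:
    """Order the changed bonds into one unbranched path of atoms.

    Raises ValueError if the changed bonds do not form a single chain.
    """
    adj: Dict[int, List[int]] = {}
    for pair in delta:
        a, b = sorted(pair)
        adj.setdefault(a, []).append(b)
        adj.setdefault(b, []).append(a)

    ends = sorted(a for a, nbrs in adj.items() if len(nbrs) == 1)
    if len(ends) != 2 or any(len(nbrs) > 2 for nbrs in adj.values()):
        raise ValueError("changed bonds do not form a simple linear path")

    # Illustrative heuristic (an assumption, not a rule from the paper):
    # start at the end atom that gains a bond, else the smaller map number.
    start = next((a for a in ends if delta[bond(a, adj[a][0])] > 0), ends[0])

    path, prev = [start], None
    while True:
        nxt = [n for n in adj[path[-1]] if n != prev]
        if not nxt:
            break
        prev = path[-1]
        path.append(nxt[0])

    if len(path) != len(adj):
        raise ValueError("changed bonds are not a single connected path")
    return path


# Toy substitution: atom 1 attacks atom 2, and atom 3 leaves.
reactant_bonds = {bond(2, 3): 1, bond(2, 4): 1}
product_bonds = {bond(1, 2): 1, bond(2, 4): 1}

delta = changed_bonds(reactant_bonds, product_bonds)
path = linear_path(delta)
```

In this toy example only two of the bonds change, which illustrates the sparsity argument: the bond between atoms 2 and 4 is untouched and never appears in the path. A real pipeline would need a cheminformatics toolkit to read the atom-mapped SMILES and would have to handle lone pairs, charges, and reactions that are not linear.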
