Computational Chemical Synthesis Analysis and Pathway Design

Retrosynthetic analysis, introduced in the 1960s, transformed chemical synthesis analysis and pathway design from a complex problem into a systematic process of structural simplification. This review summarizes recent developments in computer-assisted synthetic analysis and design, and the contributions that machine-learning algorithms have made to them. The LHASA system pioneered the encoding of semi-empirical reaction modes in a computer; the rule-based and network-searching work that followed not only expanded the underlying databases but also introduced new ways of representing reaction rules. Programs such as ARChem Route Designer replaced hand-coded reaction modes with automatically extracted rules, and programs such as Chematica recast traditional synthesis design as a network search. Later, with the help of machine learning, two-step models that combine reaction rules with statistical methods became the mainstream. More recently, fully data-driven methods based on deep neural networks, which require no prior knowledge, have been applied to the field. So far, however, these methods cannot replace experienced organic chemists because their accuracy remains relatively low. New algorithms, aided by more powerful computational hardware, are expected to make this a promising field.
