Finding the K best synthesis plans

In synthesis planning, the goal is to synthesize a target molecule from available starting materials, possibly while optimizing costs such as the price or environmental impact of the process. Current algorithmic approaches to synthesis planning are usually based on selecting a bond set and finding a single good plan among those induced by it. We demonstrate that synthesis planning can be phrased as a combinatorial optimization problem on hypergraphs by modeling individual synthesis plans as directed hyperpaths embedded in a hypergraph of reactions (HoR) representing the chemistry of interest. As a consequence, a polynomial-time algorithm for finding the K shortest hyperpaths can be used to compute the K best synthesis plans for a given target molecule. Having K good plans to choose from has several benefits: it makes the synthesis planning process much more robust when further chemical detail is added in later stages, it allows one to combine several notions of cost, and it provides a way to deal with imprecise yield estimates. A bond set gives rise to a HoR in a natural way. However, our modeling is not restricted to bond-set-based approaches: any set of known reactions and starting materials can be used to define a HoR. We also discuss classical quality measures for synthesis plans, such as overall yield and convergency, and demonstrate that convergency has a built-in inconsistency that could render its use in synthesis planning questionable. Decalin is used as an illustrative example of the use and implications of our results.
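To make the hypergraph view concrete, the following Python sketch covers only the simplest case, K = 1. It is not the K shortest hyperpaths algorithm referred to above. It assumes that every reaction is a hyperedge from a set of reactants to a single product, that costs are non-negative and additive, and that starting materials have a known purchase cost. The molecule names, the cost values, and the function names `best_plan_costs` and `extract_plan` are hypothetical and chosen only for illustration.

```python
import heapq
from math import inf


def best_plan_costs(reactions, starting_materials):
    """Cheapest additive cost of making each reachable molecule.

    reactions: list of (reactants, product, cost) hyperedges (assumed format).
    starting_materials: dict mapping molecule -> purchase cost.
    Assumes all costs are non-negative (Dijkstra-style settling).
    """
    cost = dict(starting_materials)
    best_reaction = {}  # molecule -> index of the reaction chosen to make it
    uses = {}           # molecule -> reactions that consume it
    remaining = []      # per reaction: reactants not yet settled
    for i, (reactants, _product, _c) in enumerate(reactions):
        distinct = set(reactants)
        remaining.append(len(distinct))
        for r in distinct:
            uses.setdefault(r, []).append(i)

    heap = [(c, m) for m, c in cost.items()]
    heapq.heapify(heap)
    settled = set()
    while heap:
        _c, m = heapq.heappop(heap)
        if m in settled:
            continue  # stale heap entry
        settled.add(m)
        for i in uses.get(m, []):
            remaining[i] -= 1
            if remaining[i] == 0:  # all reactants now have final costs
                reactants, product, rc = reactions[i]
                total = rc + sum(cost[r] for r in reactants)
                if product not in settled and total < cost.get(product, inf):
                    cost[product] = total
                    best_reaction[product] = i
                    heapq.heappush(heap, (total, product))
    return cost, best_reaction


def extract_plan(target, reactions, best_reaction):
    """Collect the reaction indices of the cheapest plan for target."""
    plan, stack, seen = [], [target], set()
    while stack:
        m = stack.pop()
        if m in best_reaction and m not in seen:
            seen.add(m)
            i = best_reaction[m]
            plan.append(i)
            stack.extend(reactions[i][0])
    return plan


# Toy HoR: two alternative two-step routes to a target "T".
toy_reactions = [
    (("A", "B"), "I1", 2),
    (("I1", "C"), "T", 3),
    (("A", "C"), "I2", 1),
    (("I2", "B"), "T", 5),
]
toy_starting = {"A": 0, "B": 0, "C": 0}

toy_cost, toy_best = best_plan_costs(toy_reactions, toy_starting)
toy_plan = extract_plan("T", toy_reactions, toy_best)
print(toy_cost["T"], toy_plan)
```

In this toy instance the route through `I1` costs 5 and the route through `I2` costs 6, so the sketch selects the former. Ranking the second-best and further plans requires a dedicated K shortest hyperpaths procedure, which this sketch does not attempt.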
