Bayesian reaction optimization as a tool for chemical synthesis

Reaction optimization is fundamental to synthetic chemistry, from optimizing the yield of industrial processes to selecting conditions for the preparation of medicinal candidates¹. Likewise, parameter optimization is omnipresent in artificial intelligence, from tuning virtual personal assistants to training social media and product recommendation systems². Owing to the high cost associated with carrying out experiments, scientists in both areas set numerous (hyper)parameter values by evaluating only a small subset of the possible configurations. Bayesian optimization, an iterative response surface-based global optimization algorithm, has demonstrated exceptional performance in the tuning of machine learning models³. Bayesian optimization has also recently been applied in chemistry⁴⁻⁹; however, its application and assessment for reaction optimization in synthetic chemistry have not been investigated. Here we report the development of a framework for Bayesian reaction optimization and an open-source software tool that allows chemists to easily integrate state-of-the-art optimization algorithms into their everyday laboratory practices. We collect a large benchmark dataset for a palladium-catalysed direct arylation reaction, perform a systematic study of Bayesian optimization compared with human decision-making in reaction optimization, and apply Bayesian optimization to two real-world optimization efforts (Mitsunobu and deoxyfluorination reactions). Benchmarking is accomplished via an online game that links the decisions made by expert chemists and engineers to real experiments run in the laboratory. Our findings demonstrate that Bayesian optimization outperforms human decision-making in both average optimization efficiency (number of experiments) and consistency (variance of outcome against initially available data). Overall, our studies suggest that adopting Bayesian optimization methods into everyday laboratory practices could facilitate more efficient synthesis of functional chemicals by enabling better-informed, data-driven decisions about which experiments to run.
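To make the phrase "iterative response surface-based global optimization" concrete, the following is a minimal sketch of one common form of Bayesian optimization. It is not the software tool reported in this work. Everything in it is an assumption chosen for illustration: a zero-mean Gaussian process surrogate with a squared-exponential kernel, an expected-improvement acquisition function, a single reaction variable scaled to [0, 1], and a hypothetical `toy_yield` function standing in for a laboratory experiment.

```python
import math

def rbf(a, b, length=0.15):
    """Squared-exponential kernel between two scalar inputs (assumed kernel)."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)

def cholesky(A):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def solve_lower(L, b):
    """Solve L y = b by forward substitution."""
    y = []
    for i in range(len(b)):
        y.append((b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i])
    return y

def solve_upper_t(L, y):
    """Solve L^T x = y by back substitution."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def gp_posterior(xs, ys, x_new, noise=1e-4):
    """Posterior mean and standard deviation of a zero-mean GP at x_new."""
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    L = cholesky(K)
    alpha = solve_upper_t(L, solve_lower(L, ys))
    k_star = [rbf(a, x_new) for a in xs]
    mean = sum(k * a for k, a in zip(k_star, alpha))
    v = solve_lower(L, k_star)
    var = max(rbf(x_new, x_new) - sum(t * t for t in v), 0.0)
    return mean, math.sqrt(var)

def expected_improvement(mean, std, best):
    """Expected improvement over the best observed value (maximization)."""
    if std < 1e-12:
        return 0.0
    z = (mean - best) / std
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)
    cdf = 0.5 * (1 + math.erf(z / math.sqrt(2)))
    return (mean - best) * cdf + std * pdf

def toy_yield(x):
    """Hypothetical stand-in for running one experiment at condition x."""
    return math.exp(-((x - 0.7) ** 2) / 0.02)

def bayesian_optimize(candidates, initial_idx, budget, experiment):
    """Repeatedly run the untried candidate with the highest expected improvement."""
    tried = list(initial_idx)
    ys = [experiment(candidates[i]) for i in tried]
    for _ in range(budget):
        untried = [i for i in range(len(candidates)) if i not in tried]
        if not untried:
            break
        xs = [candidates[i] for i in tried]
        best = max(ys)

        def score(i):
            mean, std = gp_posterior(xs, ys, candidates[i])
            return expected_improvement(mean, std, best)

        nxt = max(untried, key=score)
        tried.append(nxt)
        ys.append(experiment(candidates[nxt]))
    return [candidates[i] for i in tried], ys

if __name__ == "__main__":
    grid = [i / 20 for i in range(21)]  # 21 candidate conditions on [0, 1]
    xs, ys = bayesian_optimize(grid, [0, 20], budget=8, experiment=toy_yield)
    best_y, best_x = max(zip(ys, xs))
    print(f"best condition tried: {best_x}, toy yield: {best_y:.3f}")
```

Each iteration refits the surrogate to all results so far and then selects the next experiment by trading off predicted yield against uncertainty, which is the sense in which only a small subset of configurations is evaluated. Real reaction spaces mix categorical and continuous variables and need a chemically meaningful encoding, which this one-dimensional sketch does not attempt.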

[1] Alán Aspuru-Guzik, et al. Phoenics: A Bayesian Optimizer for Chemistry, 2018, ACS central science.

[2] J. T. Njardarson, et al. Analysis of the structural diversity, substitution patterns, and frequency of nitrogen heterocycles among U.S. FDA approved pharmaceuticals, 2014, Journal of medicinal chemistry.

[3] Donald R. Jones, et al. Efficient Global Optimization of Expensive Black-Box Functions, 1998, J. Glob. Optim.

[4] Daniel Reker, et al. Adaptive Optimization of Chemical Reactions with Minimal Experimental Information, 2020, Cell Reports Physical Science.

[5] Carlos Mateos, et al. Automated platforms for reaction self-optimization in flow, 2019, Reaction Chemistry & Engineering.

[6] Martin D. Eastgate, et al. C-H Arylation in the Formation of a Complex Pyrrolopyridine, the Commercial Synthesis of the Potent JAK2 Inhibitor, BMS-911543, 2018, The Journal of organic chemistry.

[7] Neal G. Anderson, et al. Design of Experiments (DoE) and Process Optimization. A Review of Recent Publications, 2015.

[8] Alán Aspuru-Guzik, et al. Next-Generation Experimentation with Self-Driving Laboratories, 2019, Trends in Chemistry.

[9] E. Balaraman, et al. Mitsunobu and related reactions: advances and applications, 2009, Chemical reviews.

[10] Michael Schmidt, et al. Mono-Oxidation of Bidentate Bis-phosphines in Catalyst Activation: Kinetic and Mechanistic Studies of a Pd/Xantphos-Catalyzed C-H Functionalization, 2015, Journal of the American Chemical Society.

[11] S. Takizawa, et al. Exploration of flow reaction conditions using machine-learning for enantioselective organocatalyzed Rauhut-Currier and [3+2] annulation sequence, 2020, Chemical communications.

[12] Marianthi G. Ierapetritou, et al. Feasibility and flexibility analysis of black-box processes Part 1: Surrogate-based feasibility analysis, 2015.

[13] Kevin Bateman, et al. Nanomole-scale high-throughput chemistry for the synthesis of complex molecules, 2015, Science.

[14] Koji Tsuda, et al. COMBO: An efficient Bayesian optimization library for materials science, 2016.

[15] Gang Luo, et al. A review of automatic selection methods for machine learning algorithms and hyper-parameter values, 2016, Network Modeling Analysis in Health Informatics and Bioinformatics.

[16] Warren B. Powell, et al. The Knowledge-Gradient Algorithm for Sequencing Experiments in Drug Discovery, 2011, INFORMS J. Comput.

[17] Gisbert Schneider, et al. Active-learning strategies in computer-assisted drug discovery, 2015, Drug discovery today.

[18] Derek T. Ahneman, et al. Deoxyfluorination with Sulfonyl Fluorides: Navigating Reaction Space with Machine Learning, 2018, Journal of the American Chemical Society.

[19] W. Hagmann, et al. The many roles for fluorine in medicinal chemistry, 2008, Journal of medicinal chemistry.

[20] Nando de Freitas, et al. Taking the Human Out of the Loop: A Review of Bayesian Optimization, 2016, Proceedings of the IEEE.

[21] Artur M. Schweidtmann, et al. Machine learning meets continuous flow chemistry: Automated optimization towards the Pareto front of multiple objectives, 2018, Chemical Engineering Journal.

[22] Klavs F. Jensen, et al. Photoredox Iridium–Nickel Dual-Catalyzed Decarboxylative Arylation Cross-Coupling: From Batch to Continuous Flow via Self-Optimizing Segmented Flow Reactor, 2018.

[23] Mark E. Scott, et al. Aryl-aryl bond formation by transition-metal-catalyzed direct arylation, 2007, Chemical reviews.

[24] S. Fletcher. The Mitsunobu reaction in the 21st century, 2015.

[25] O. Mitsunobu, et al. Preparation of Esters of Carboxylic and Phosphoric Acid via Quaternary Phosphonium Salts, 1967.

[26] Reiner Sebastian Sprick, et al. A mobile robotic chemist, 2020, Nature.

[27] M. D. Hill, et al. Applications of Fluorine in Medicinal Chemistry, 2015, Journal of medicinal chemistry.

[28] L. Hunter, et al. Recent Developments in the Deoxyfluorination of Alcohols and Phenols: New Reagents, Mechanistic Insights, and Applications, 2017, Synthesis.

[29] D. Morton, et al. Recent Advances in C-H Functionalization, 2016, The Journal of organic chemistry.

[30] Chris Morley, et al. Open Babel: An open chemical toolbox, 2011, J. Cheminformatics.

[31] Paul M. Murray, et al. The application of design of experiments (DoE) reaction optimisation and solvent selection in the development of new synthetic chemistry, 2016, Organic & biomolecular chemistry.

[32] Ryan-Rhys Griffiths, et al. Constrained Bayesian optimization for automatic chemical design using variational autoencoders, 2019, Chemical science.

[33] Ruth Misener, et al. GPdoemd: a python package for design of experiments for model discrimination, 2018, Comput. Chem. Eng.

[34] Peter I. Frazier, et al. Parallel Bayesian Global Optimization of Expensive Functions, 2016, Oper. Res.

[35] Tatsuya Takagi, et al. Mordred: a molecular descriptor calculator, 2018, Journal of Cheminformatics.

[36] Melanie S Sanford, et al. Palladium-catalyzed ligand-directed C-H functionalization reactions, 2010, Chemical reviews.

[37] Andrew J. deMello, et al. Fast and Reliable Metamodeling of Complex Reaction Spaces Using Universal Kriging, 2014.

[38] Derek J Durand, et al. Computational Ligand Descriptors for Catalyst Design, 2019, Chemical reviews.

[39] Marianthi G. Ierapetritou, et al. Feasibility analysis of black-box processes using an adaptive sampling Kriging-based method, 2012, Comput. Chem. Eng.

[40] Derek T. Ahneman, et al. Predicting reaction performance in C–N cross-coupling using machine learning, 2018, Science.

[41] Robert Lee, et al. Statistical Design of Experiments for Screening and Optimization, 2019, Chemie Ingenieur Technik.

[42] Richard N. Zare, et al. Optimizing Chemical Reactions with Deep Reinforcement Learning, 2017, ACS central science.

[43] A. Doyle, et al. PyFluor: A Low-Cost, Stable, and Selective Deoxyfluorination Reagent, 2015, Journal of the American Chemical Society.

[44] J. Mockus. On the Bayes Methods for Seeking the Extremal Point, 1975.

[45] Jonathan Grizou, et al. Human versus Robots in the Discovery and Crystallization of Gigantic Polyoxometalates, 2017, Angewandte Chemie.

[46] Brian A. Taylor, et al. Algorithms for the self-optimisation of chemical reactions, 2019, Reaction Chemistry & Engineering.

[47] Paul Richardson, et al. A platform for automated nanomole-scale reaction screening and micromole-scale synthesis in flow, 2018, Science.

[48] Erik Johansson, et al. Generalized Subset Designs in Analytical Chemistry, 2017, Analytical chemistry.