Expert System for Predicting Reaction Conditions: The Michael Reaction Case

A generic chemical transformation can often be achieved under various synthetic conditions. However, for any specific set of reagents, only one or a few of the reported synthetic protocols may be successful. For example, Michael β-addition reactions may proceed under different choices of solvent (e.g., hydrophobic, aprotic polar, protic) and catalyst (e.g., Brønsted acid, Lewis acid, Lewis base). Chemoinformatics methods can be used to relate reagent structures to the required reaction conditions, so that synthetic chemists waste less time and fewer resources trying out various protocols in search of the appropriate one. To address this problem, a number of two-class classification models were built on a set of 198 Michael reactions retrieved from the literature. Each trained model discriminates between processes that are feasible and processes that are not feasible under a specific reaction condition option (feasible or not with a Lewis acid catalyst, feasible or not in a hydrophobic solvent, etc.). Eight distinct models were built to decide the compatibility of a Michael addition process with each considered reaction condition option, while a ninth model was aimed at predicting whether the assumed Michael addition is feasible at all. Different machine-learning methods (Support Vector Machine, Naive Bayes, and Random Forest) were used in combination with different types of descriptors (ISIDA fragments issued from Condensed Graphs of Reactions, MOLMAP, Electronic Effect Descriptors, and descriptors computed with the Chemistry Development Kit). The models show good predictive performance in 3-fold cross-validation repeated three times: balanced accuracy ranges from 0.7 to 1. The developed models are available to users at http://infochim.u-strasbg.fr/webserv/VSEngine.html.
Finally, the models were challenged to predict feasibility conditions for ∼50 novel Michael reactions from the eNovalys database (originally from patent literature).
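
The abstract reports balanced accuracy without defining it. The sketch below assumes the usual two-class definition, the mean of sensitivity and specificity, which is not spelled out in the text. The function name and the toy labels (1 = feasible under a given condition, 0 = not feasible) are hypothetical and are not taken from the study's data.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (1 or 0).

    Assumed definition; the abstract does not state the formula it used.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # feasible, predicted feasible
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # feasible, predicted not feasible
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # not feasible, predicted not feasible
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # not feasible, predicted feasible
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


# Hypothetical toy labels for one condition model.
y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 1]
print(balanced_accuracy(y_true, y_pred))
```

Under this definition a model that always predicts the majority class scores 0.5 regardless of class imbalance, which is one plausible reason to prefer it over plain accuracy when feasible and non-feasible reactions are unevenly represented.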
