Predicting Solvent-Dependent Nucleophilicity Parameter with a Causal Structure Property Relationship

Solvent-dependent reactivity is a key aspect of synthetic science and controls reaction selectivity. The contemporary focus on new, sustainable solvents highlights the need for reactivity predictions in different solvents. Herein, we report accurate machine learning prediction of the nucleophilicity parameter N in the four most common solvents for nucleophiles in Mayr's reactivity parameter database (R² = 0.93 and 81.6% of predictions within ±2.0 of the experimental values with the Extra Trees algorithm). A Causal Structure Property Relationship (CSPR) approach was used, with a focus on the physicochemical relationships between the descriptors and the predicted parameters, and on rational improvement of the prediction models. The nucleophiles were represented by a series of electronic and steric descriptors, and the solvents by principal component analysis (PCA) descriptors based on the ACS Solvent Tool. The models indicated that steric factors do not contribute significantly, which is attributed to bias in the experimental database. The most important descriptors are the solvent-dependent HOMO energy and the Hirshfeld charge of the nucleophilic atom. Replacing DFT descriptors with Parameterization Method 6 (PM6) descriptors for the nucleophiles led to an 8.7-fold decrease in computational time and a ∼10% decrease in the percentage of predictions within ±2.0 and ±1.0 of the experimental values.
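
The two accuracy measures quoted above (R² and the percentage of predictions within a tolerance of the experimental value) can be computed as in the following minimal sketch. The function names and the toy N values are hypothetical and chosen only for illustration; they are not taken from the paper or from Mayr's database, and this is not the authors' evaluation code.

```python
from typing import Sequence


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def percent_within(y_true: Sequence[float], y_pred: Sequence[float], tol: float) -> float:
    """Percentage of predictions whose absolute error is at most tol."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if abs(t - p) <= tol)
    return 100.0 * hits / len(y_true)


# Hypothetical experimental and predicted N values (illustration only).
n_experimental = [10.0, 12.0, 14.0, 16.0, 18.0]
n_predicted = [10.5, 10.5, 17.0, 16.0, 18.5]

if __name__ == "__main__":
    print("R^2:", r_squared(n_experimental, n_predicted))
    print("% within ±2.0:", percent_within(n_experimental, n_predicted, 2.0))
    print("% within ±1.0:", percent_within(n_experimental, n_predicted, 1.0))
```

Reporting both tolerances alongside R² is useful because R² summarizes overall variance explained, while the tolerance percentages show how often an individual prediction is close enough to be practically usable.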
