Chemical reactivity predictions: Use of data mining techniques for analyzing regioselective azidolysis of epoxides

Azidolysis of epoxides, followed by reduction of the intermediate azido alcohols, is a valuable synthetic route to β‐amino alcohols, an important functionality found in many biologically active natural compounds. However, depending on the conditions under which the azidolysis is carried out, two regioisomeric products can form, because the nucleophile can attack either of the two oxirane carbon atoms. In this work, predictive models for quantitative structure‐reactivity relationships were developed by means of multiple linear regression, k‐nearest neighbor, locally weighted regression, and Gaussian process regression algorithms. The specific nature of the problem required the creation of new descriptors able to reflect the most relevant features of the molecular moieties directly involved in the ring‐opening process. The resulting models predict the regioselectivity of the azidolysis of epoxides promoted by sodium azide in the presence of lithium perchlorate, on the basis of the steric hindrance and charge distribution of the substituents directly attached to the oxirane ring. © 2010 Wiley Periodicals, Inc. J Comput Chem 2010
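As a rough illustration of one of the algorithm families named above, the following minimal sketch shows k‐nearest neighbor regression on a small descriptor table. It is not the authors' implementation: the two descriptors (a steric term and a charge term), the target (fraction of attack at one oxirane carbon), and every numeric value are hypothetical placeholders, not data from the paper.

```python
import math

def knn_predict(train_X, train_y, query, k=3):
    """Predict a target as the inverse-distance-weighted mean
    of the k training points closest to `query`."""
    # Pair each training point's Euclidean distance to the query with its target.
    neighbors = sorted(
        (math.dist(x, query), y) for x, y in zip(train_X, train_y)
    )[:k]
    # If the query coincides with a training point, return its target directly
    # (avoids division by zero in the weighting below).
    for d, y in neighbors:
        if d == 0.0:
            return y
    weights = [1.0 / d for d, _ in neighbors]
    return sum(w * y for w, (_, y) in zip(weights, neighbors)) / sum(weights)

# Hypothetical descriptors: (steric hindrance, partial charge) per epoxide.
# Placeholder values for illustration only.
train_X = [(0.1, 0.2), (0.9, 0.1), (0.5, 0.8), (0.2, 0.9)]
# Hypothetical target: fraction of nucleophilic attack at one ring carbon.
train_y = [0.2, 0.9, 0.5, 0.3]

if __name__ == "__main__":
    print(knn_predict(train_X, train_y, (0.4, 0.5), k=3))
```

Because the prediction is a weighted average of observed targets, it always stays within the range of the training values, which is one reason neighbor-based models are sensitive to how well the descriptors cover the compounds of interest.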
