Development of a fast and accurate method of 13C NMR chemical shift prediction

Article history: Received 20 April 2008; received in revised form 25 December 2008; accepted 29 January 2009; available online 11 February 2009

Keywords: Computer-assisted structure elucidation; NMR chemical shift calculation; PLS regression

Abstract

In this article we describe a fast and accurate method of 13C NMR chemical shift prediction. The high calculation speed is achieved using a simple structure description scheme based on individual atoms rather than functional groups. The systematic choice of an appropriate encoding scheme and the use of partial least squares regression on a large training set have resulted in a robust and fast algorithm. The approach provides accuracy comparable with other well-known approaches but is up to a thousand times faster.

© 2009 Elsevier B.V. All rights reserved.

1. Introduction

In various fields of chemistry, such as the investigation of natural products or the design of new compounds, scientists often need either to determine de novo the structure of an unknown or new compound or to verify a hypothetical chemical structure. This process, known as Structure Elucidation, is based on the analysis of available spectral data. Nuclear Magnetic Resonance (NMR) spectroscopy is certainly one of the main analytical methods applied to these challenges and is a powerful technique for acquiring highly informative spectra associated with a structure.

Nuclear magnetic resonance (NMR) is a physical phenomenon based upon the quantum mechanical magnetic properties of an atom's nucleus. Magnetic nuclei, like
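The approach summarized in the abstract (an atom-based structure description fed into partial least squares regression) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the article: the descriptor simply counts heavy atoms of each element at one and two bonds from the carbon of interest, the element alphabet is limited to C, N and O, the regression is a textbook NIPALS PLS1 written in plain Python, and the target values are generated from an arbitrary linear rule rather than measured chemical shifts.

```python
from collections import deque

ELEMENTS = ("C", "N", "O")  # hypothetical heavy-atom alphabet (assumption)

def atom_descriptor(elements, bonds, center, max_depth=2):
    """Count heavy atoms of each element at each bond distance from `center`."""
    neighbours = {i: [] for i in range(len(elements))}
    for a, b in bonds:
        neighbours[a].append(b)
        neighbours[b].append(a)
    counts = [0.0] * (max_depth * len(ELEMENTS))
    depth = {center: 0}
    queue = deque([center])
    while queue:  # breadth-first search gives shortest bond distances
        atom = queue.popleft()
        if depth[atom] == max_depth:
            continue
        for nxt in neighbours[atom]:
            if nxt not in depth:
                depth[nxt] = depth[atom] + 1
                slot = (depth[nxt] - 1) * len(ELEMENTS) + ELEMENTS.index(elements[nxt])
                counts[slot] += 1.0
                queue.append(nxt)
    return counts

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def pls1_fit(X, y, n_components, tol=1e-9):
    """Textbook NIPALS PLS1 for a single response; returns means and components."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(p)] for row in X]  # centered X
    f = [v - y_mean for v in y]                                 # centered y
    components = []
    for _ in range(n_components):
        w = [dot([row[j] for row in E], f) for j in range(p)]   # weights ~ X^T y
        norm = dot(w, w) ** 0.5
        if norm < tol:  # nothing left in y that X can explain
            break
        w = [v / norm for v in w]
        t = [dot(row, w) for row in E]                          # scores
        tt = dot(t, t)
        load = [dot([row[j] for row in E], t) / tt for j in range(p)]
        q = dot(f, t) / tt
        # Deflate X and y before extracting the next component.
        E = [[row[j] - t[i] * load[j] for j in range(p)] for i, row in enumerate(E)]
        f = [f[i] - q * t[i] for i in range(n)]
        components.append((w, load, q))
    return x_mean, y_mean, components

def pls1_predict(model, x):
    """Predict by replaying the same deflation steps on one descriptor vector."""
    x_mean, y_mean, components = model
    e = [a - b for a, b in zip(x, x_mean)]
    y_hat = y_mean
    for w, load, q in components:
        t = dot(e, w)
        y_hat += q * t
        e = [a - t * b for a, b in zip(e, load)]
    return y_hat

# Toy molecules as heavy-atom graphs (hydrogens implicit).
ethane = (["C", "C"], [(0, 1)])
propane = (["C", "C", "C"], [(0, 1), (1, 2)])
ethanol = (["C", "C", "O"], [(0, 1), (1, 2)])
atoms = [(ethane, 0), (propane, 0), (propane, 1), (ethanol, 0), (ethanol, 1)]
X_train = [atom_descriptor(mol[0], mol[1], idx) for mol, idx in atoms]

# Synthetic targets from an arbitrary linear rule; NOT real chemical shifts.
y_train = [5.0 + 9.0 * d[0] + 45.0 * d[2] + 9.0 * d[3] + 10.0 * d[5] for d in X_train]

model = pls1_fit(X_train, y_train, n_components=len(X_train[0]))
predictions = [pls1_predict(model, d) for d in X_train]
```

Because each atom is encoded by a short fixed-length count vector and prediction reduces to a handful of dot products, a scheme of this kind is cheap to evaluate, which is consistent with the speed argument made in the abstract. The actual encoding and training set used by the authors are not given in this excerpt.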
