A Quantum Mechanical/Neural Net Model for Boiling Points with Error Estimation

We present QSPR models for normal boiling points employing a neural network approach and descriptors calculated using semiempirical MO theory (AM1 and PM3). These models are based on a data set of 6000 compounds with widely varying functionality and should therefore be applicable to a diverse range of systems. We include cross-validation by simultaneously training 10 different networks, each with different training and test sets. The predicted boiling point is given by the mean of the 10 results, and the individual error of each compound is related to the standard deviation of these predictions. For our best model we find that the standard deviation of the training error is 16.5 K for 6000 compounds and the correlation coefficient (R²) between our prediction and experiment is 0.96. We also examine the effect of different conformations and tautomerism on our calculated results. Large deviations between our predictions and experiment can generally be explained by experimental errors or problems with the semiempirical methods.
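The ensemble scheme described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `make_splits` and `ensemble_estimate`, the 10% test fraction, the random (possibly overlapping) test sets, and the example predictions are all assumptions made for illustration. The abstract states only that the per-compound error is "related to" the standard deviation of the 10 predictions, so the raw spread returned here stands in for that estimate.

```python
import random
import statistics


def make_splits(n_compounds, n_networks=10, test_fraction=0.1, seed=0):
    """Return one (train_idx, test_idx) pair per network.

    Assumption: each network gets an independent random split; the
    source does not say how the training and test sets were drawn.
    """
    rng = random.Random(seed)
    n_test = max(1, int(n_compounds * test_fraction))
    splits = []
    for _ in range(n_networks):
        idx = list(range(n_compounds))
        rng.shuffle(idx)
        test_idx = sorted(idx[:n_test])
        train_idx = sorted(idx[n_test:])
        splits.append((train_idx, test_idx))
    return splits


def ensemble_estimate(per_network_predictions):
    """Combine the boiling points (K) predicted by several networks.

    Returns the mean prediction and the sample standard deviation,
    used here as a simple stand-in for the per-compound error.
    """
    mean_bp = statistics.mean(per_network_predictions)
    spread = statistics.stdev(per_network_predictions)
    return mean_bp, spread


# Hypothetical outputs (K) of 10 trained networks for one compound.
preds = [351.2, 349.8, 352.5, 350.1, 348.9, 353.0, 350.7, 351.9, 349.5, 352.4]
bp, err = ensemble_estimate(preds)
print(f"predicted b.p. = {bp:.1f} K, ensemble spread = {err:.1f} K")
```

A large spread flags compounds on which the networks disagree, which is the sense in which the ensemble supplies an individual error estimate alongside each prediction.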
