Size-independent neural networks based first-principles method for accurate prediction of heat of formation of fuels.

A neural network-based first-principles method for predicting heat of formation (HOF) was previously shown to achieve chemical accuracy across a broad spectrum of target molecules [L. H. Hu et al., J. Chem. Phys. 119, 11501 (2003)]. However, its accuracy deteriorates as molecular size increases. A closer inspection reveals a systematic correlation between the prediction error and the molecular size, which appears correctable by further statistical analysis and calls for a more sophisticated machine learning algorithm. Despite the apparent difference between simple and complex molecules, all the essential physical information is already present in a carefully selected set of representative small molecules. A model that captures the fundamental physics should therefore be able to predict the HOF of large, complex molecules from information extracted only from a small-molecule database. To this end, a size-independent, multi-step, multi-variable linear regression-neural network-B3LYP method is developed in this work, which improves the overall prediction accuracy while training on smaller molecules only. In particular, the calculation errors for larger molecules are drastically reduced to the same magnitude as those for the smaller molecules. Specifically, the method is based on a 164-molecule database of molecules composed of hydrogen and carbon. Four molecular descriptors were selected to encode each molecule's characteristics, including the raw HOF calculated with B3LYP and the molecular size. With the size-independent machine learning correction, the mean absolute deviation (MAD) of the B3LYP/6-311+G(3df,2p)-calculated HOF is reduced from 16.58 to 1.43 kcal/mol for the training set and from 17.33 to 1.69 kcal/mol for the testing set of small molecules. Furthermore, the MAD of the testing set of large molecules is reduced from 28.75 to 1.67 kcal/mol.
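The idea of fitting a size-aware correction on small molecules and checking it on larger ones can be illustrated with a minimal sketch. It covers only a single linear regression step, not the neural network stage or the full multi-step procedure, and it is not the authors' implementation. The data are synthetic: the toy error model (raw error growing linearly with atom count), the function names, and the use of only two descriptors (raw HOF and atom count as a stand-in for molecular size) are all assumptions made for illustration.

```python
import numpy as np


def mean_absolute_deviation(predicted, reference):
    """MAD between predicted and reference values (same units, e.g. kcal/mol)."""
    predicted = np.asarray(predicted, dtype=float)
    reference = np.asarray(reference, dtype=float)
    return float(np.mean(np.abs(predicted - reference)))


def fit_linear_correction(descriptors, reference):
    """Least-squares fit of reference HOF against descriptors plus an intercept."""
    design = np.column_stack([descriptors, np.ones(len(descriptors))])
    coefficients, *_ = np.linalg.lstsq(design, reference, rcond=None)
    return coefficients


def apply_linear_correction(coefficients, descriptors):
    """Corrected HOF predicted from descriptors with the fitted coefficients."""
    design = np.column_stack([descriptors, np.ones(len(descriptors))])
    return design @ coefficients


def make_synthetic_set(rng, n_atoms):
    """Toy data (assumption): raw error grows linearly with atom count."""
    n_atoms = np.asarray(n_atoms, dtype=float)
    reference = rng.normal(0.0, 30.0, size=n_atoms.size)
    raw = reference + 1.5 * n_atoms + rng.normal(0.0, 0.5, size=n_atoms.size)
    # Descriptor columns: raw (uncorrected) HOF, molecular size.
    return np.column_stack([raw, n_atoms]), reference


rng = np.random.default_rng(0)

# Train on small molecules only; evaluate on larger ones never seen in training.
small_x, small_ref = make_synthetic_set(rng, rng.integers(2, 12, size=100))
large_x, large_ref = make_synthetic_set(rng, rng.integers(12, 30, size=40))

coefficients = fit_linear_correction(small_x, small_ref)

mad_raw_large = mean_absolute_deviation(large_x[:, 0], large_ref)
mad_corrected_large = mean_absolute_deviation(
    apply_linear_correction(coefficients, large_x), large_ref
)

print(f"large molecules, raw MAD:       {mad_raw_large:.2f}")
print(f"large molecules, corrected MAD: {mad_corrected_large:.2f}")
```

Because the toy error is exactly linear in size, the regression extrapolates cleanly to the larger molecules. Real B3LYP errors need not behave this way, which is presumably why the method described above adds further steps beyond a single linear fit.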
