Prediction of sweetness by multilinear regression analysis and support vector machine.

The sweetness of a compound is of great interest to the food additive industry. In this work, two quantitative models were built to predict logSw (the logarithm of sweetness) for 320 unique compounds with molecular weights from 132 to 1287 and sweetness values from 22 to 22,500,000. Each compound was represented by 12 selected molecular descriptors, and the whole dataset was randomly split into a training set of 214 compounds and a test set of 106 compounds. logSw was then predicted using multilinear regression (MLR) analysis and a support vector machine (SVM). For the test set, correlation coefficients of 0.87 and 0.88 were obtained by MLR and SVM, respectively. The descriptors found in our quantitative structure-activity relationship models lend themselves to structural interpretation and support the AH/B system model proposed by Shallenberger and Acree.
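
The workflow described in the abstract (random 214/106 split, 12 descriptors per compound, an MLR fit, and a test-set correlation coefficient) can be sketched roughly as follows. This is only an illustration under stated assumptions: the descriptor values and logSw targets are synthetic stand-ins, not the paper's data, the helper names (`solve`, `fit_mlr`, `predict`, `pearson_r`) are hypothetical, and the MLR is fitted with plain normal equations rather than whatever software the authors used. The SVM model is not reproduced here.

```python
import math
import random

# Sizes taken from the abstract; everything else below is synthetic.
N_COMPOUNDS, N_TRAIN, N_DESCRIPTORS = 320, 214, 12

def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_mlr(X, y):
    """Ordinary least squares via normal equations; returns [intercept, w1, ..., wp]."""
    rows = [[1.0] + list(r) for r in X]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(p)]
    return solve(xtx, xty)

def predict(coef, X):
    """Apply the fitted linear model to each descriptor vector."""
    return [coef[0] + sum(w * v for w, v in zip(coef[1:], r)) for r in X]

def pearson_r(u, v):
    """Pearson correlation coefficient between two equal-length sequences."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    return cov / math.sqrt(sum((a - mu) ** 2 for a in u) * sum((b - mv) ** 2 for b in v))

rng = random.Random(0)
# Synthetic stand-in: 12 random descriptors and a noisy linear logSw target.
true_w = [rng.uniform(-1, 1) for _ in range(N_DESCRIPTORS)]
X = [[rng.gauss(0, 1) for _ in range(N_DESCRIPTORS)] for _ in range(N_COMPOUNDS)]
log_sw = [4.0 + sum(w * v for w, v in zip(true_w, r)) + rng.gauss(0, 0.5) for r in X]

# Random split into 214 training and 106 test compounds.
idx = list(range(N_COMPOUNDS))
rng.shuffle(idx)
train, test = idx[:N_TRAIN], idx[N_TRAIN:]

coef = fit_mlr([X[i] for i in train], [log_sw[i] for i in train])
pred = predict(coef, [X[i] for i in test])
r_test = pearson_r([log_sw[i] for i in test], pred)
print(len(train), len(test), round(r_test, 2))
```

The correlation printed here reflects only the synthetic data and says nothing about the reported values of 0.87 and 0.88. An SVM regressor from a library such as LIBSVM could be evaluated on the same split by swapping out `fit_mlr` and `predict`.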

[1]  Chih-Jen Lin, et al. LIBSVM: A library for support vector machines, 2011, TIST.

[2]  Johann Gasteiger, et al. Deriving the 3D structure of organic molecules from their infrared spectra, 1999.

[3]  Michel Petitjean, et al. Applications of the radius-diameter diagram to the classification of topological and geometrical shapes of chemical compounds, 1992, J. Chem. Inf. Comput. Sci.

[4]  Aixia Yan, et al. Prediction of Human Intestinal Absorption by GA Feature Selection and Support Vector Machine Regression, 2008, International Journal of Molecular Sciences.

[5]  C. Hansch, et al. Dependence of Relative Sweetness on Hydrophobic Bonding, 1966, Nature.

[6]  H. Iwamura, et al. Structure-taste relationship of perillartine and nitro- and cyanoaniline derivatives, 1980, Journal of Medicinal Chemistry.

[7]  J. Gasteiger, et al. Autocorrelation of Molecular Surface Properties for Modeling Corticosteroid Binding Globulin and Cytosolic Ah Receptor Activity by Neural Networks, 1995.

[8]  In-silico prediction of sweetness of sugars and sweeteners, 2011.

[9]  Aixia Yan, et al. In silico prediction of rhabdomyolysis of compounds by self-organizing map and support vector machine, 2011, Toxicology in Vitro.

[10]  U. N. Singh, et al. Precursor–Product Relationship between Nuclear and Cytoplasmic Ribonucleic Acid, 1966, Nature.

[11]  T. Acree, et al. Molecular Theory of Sweet Taste, 1967, Nature.

[12]  A. Yan, et al. In Silico Models to Discriminate Compounds Inducing and Noninducing Toxic Myopathy, 2012, Molecular Informatics.

[13]  A. Yan, et al. Quantitative structure and bioactivity relationship study on human acetylcholinesterase inhibitors, 2012, Bioorganic & Medicinal Chemistry Letters.

[14]  Kenneth J. Miller, et al. Additivity methods in molecular polarizability, 1990.

[15]  Aixia Yan, et al. Classification of HCV NS5B Polymerase Inhibitors Using Support Vector Machine, 2012, International Journal of Molecular Sciences.

[16]  Johann Gasteiger, et al. Prediction of Aqueous Solubility of Organic Compounds Based on a 3D Structure Representation, 2003, J. Chem. Inf. Comput. Sci.

[17]  Aixia Yan, et al. Prediction of biological activity of Aurora-A kinase inhibitors by multilinear regression analysis and support vector machine, 2011, Bioorganic & Medicinal Chemistry Letters.

[18]  P. Selzer, et al. Fast calculation of molecular polar surface area as a sum of fragment-based contributions and its application to the prediction of drug transport properties, 2000, Journal of Medicinal Chemistry.