2D Quantitative Structure-Property Relationship Study of Mycotoxins by Multiple Linear Regression and Support Vector Machine

In the present work, support vector machine (SVM) and multiple linear regression (MLR) techniques were used for quantitative structure–property relationship (QSPR) studies of the retention time (tR) of 67 mycotoxins (aflatoxins, trichothecenes, roquefortines and ochratoxins) in standardized liquid chromatography–UV–mass spectrometry, based on molecular descriptors calculated from the optimized 3D structures. The most relevant descriptors were selected by applying missing-value, zero and multicollinearity tests with a cutoff value of 0.95, followed by genetic algorithm variable selection, and were then used to build the MLR and SVM models. The robustness of the QSPR models was characterized by statistical validation and applicability domain (AD) analysis. The predictions of the MLR and SVM models are in good agreement with the experimental values: the correlation and predictability, measured by r² and q², are 0.931 and 0.932, respectively, for SVM and 0.923 and 0.915, respectively, for MLR. The applicability domain of the models was investigated using a Williams plot. The effects of the different descriptors on the retention times are described.
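
The descriptor pre-tests, the MLR statistics and the leverage values behind a Williams plot can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and it rests on several assumptions: the "zero test" is read as removing constant columns, q² is taken as leave-one-out 1 − PRESS/SS_tot, the hat matrix includes the intercept, and the warning leverage is the commonly used h* = 3(p + 1)/n. All function names are hypothetical, and the genetic algorithm selection and the SVM model are not shown.

```python
import numpy as np

def prefilter_descriptors(X, corr_cutoff=0.95):
    """Return indices of descriptor columns that pass the three pre-tests."""
    X = np.asarray(X, dtype=float)
    keep = []
    for j in range(X.shape[1]):
        col = X[:, j]
        if np.isnan(col).any():        # missing-value test
            continue
        if col.max() == col.min():     # zero test, assumed to mean "constant column"
            continue
        # multicollinearity test against the columns kept so far
        if any(abs(np.corrcoef(col, X[:, k])[0, 1]) > corr_cutoff for k in keep):
            continue
        keep.append(j)
    return keep

def _design(X):
    """Descriptor matrix with a leading column of ones for the intercept."""
    X = np.asarray(X, dtype=float)
    return np.column_stack([np.ones(X.shape[0]), X])

def fit_mlr(X, y):
    """Ordinary least-squares coefficients, intercept first."""
    coef, *_ = np.linalg.lstsq(_design(X), np.asarray(y, dtype=float), rcond=None)
    return coef

def r2_and_q2_loo(X, y):
    """Fitted r2 and leave-one-out q2 (assumed form: 1 - PRESS / SS_tot)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    A = _design(X)
    ss_tot = np.sum((y - y.mean()) ** 2)
    r2 = 1.0 - np.sum((y - A @ fit_mlr(X, y)) ** 2) / ss_tot
    press = 0.0
    for i in range(len(y)):
        train = np.arange(len(y)) != i   # leave compound i out, refit, predict it
        press += (y[i] - A[i] @ fit_mlr(X[train], y[train])) ** 2
    return r2, 1.0 - press / ss_tot

def leverages(X):
    """Diagonal of the hat matrix A (A^T A)^-1 A^T, the x-axis of a Williams plot."""
    A = _design(X)
    return np.einsum("ij,jk,ik->i", A, np.linalg.pinv(A.T @ A), A)

def warning_leverage(n_compounds, n_descriptors):
    """Assumed threshold h* = 3 (p + 1) / n."""
    return 3.0 * (n_descriptors + 1) / n_compounds
```

In a Williams plot, the standardized residuals are plotted against these leverages; compounds with a leverage above h* lie outside the structural domain of the model, and compounds with large standardized residuals (a limit of ±3 is often used) are treated as response outliers.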
