A QSPR study on the GC retention times of a series of fatty, dicarboxylic and amino acids by MLR and ANN.

Quantitative structure-property relationship (QSPR) analysis was carried out on a series of fatty, amino and dicarboxylic acids to model their GC retention times. A genetic algorithm-partial least squares (GA-PLS) method was applied as a variable selection tool. The retention times of these compounds were modeled as a function of theoretically derived descriptors by multiple linear regression (MLR) and an artificial neural network (ANN). The neural network employed here is a connected back-propagation system with a 3-4-1 architecture. Three topological indices for these compounds, namely the mean information index on atomic composition (AAC), the average connectivity index chi-0 (X0A) and the total information index of atomic composition (IAC), were taken as inputs for the regression models. The results indicate that the GA is a very effective variable selection approach for QSPR analysis. A comparison of the two regression methods showed that the ANN has better predictive ability than MLR. The statistical figures of merit of the two models show that the retention times were modeled successfully with the molecular descriptors.
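
The two regression models described above can be illustrated with a minimal sketch. It is not the authors' implementation: the descriptor values and responses are synthetic stand-ins for AAC, X0A, IAC and the retention times, and the tanh hidden activation, learning rate, epoch count and all function names are assumptions, since the abstract specifies only the three inputs and the 3-4-1 architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins for the three descriptors (AAC, X0A, IAC) and a
# retention-time-like response. These are NOT the data from the study.
X = rng.normal(size=(40, 3))
y = (1.5 * X[:, 0] - 0.8 * X[:, 1] + 0.5 * np.tanh(2.0 * X[:, 2])
     + 0.05 * rng.normal(size=40))


def fit_mlr(X, y):
    """Ordinary least squares with an intercept; returns [b0, b1, b2, b3]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_mlr(coef, X):
    return coef[0] + X @ coef[1:]


def init_ann(rng, n_in=3, n_hidden=4):
    """Weights for a 3-4-1 network: 3 inputs, 4 hidden units, 1 output."""
    return {
        "W1": rng.normal(0.0, 0.5, size=(n_in, n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(0.0, 0.5, size=(n_hidden, 1)),
        "b2": np.zeros(1),
    }


def forward(p, X):
    """Return hidden activations and the linear output."""
    H = np.tanh(X @ p["W1"] + p["b1"])  # assumed activation
    return H, (H @ p["W2"] + p["b2"]).ravel()


def train_ann(p, X, y, lr=0.05, epochs=3000):
    """Full-batch back-propagation on the loss 0.5 * mean squared error."""
    n = len(X)
    for _ in range(epochs):
        H, y_hat = forward(p, X)
        err = (y_hat - y)[:, None]                 # gradient at the output
        dH = (err @ p["W2"].T) * (1.0 - H ** 2)    # propagated through tanh
        p["W2"] -= lr * (H.T @ err) / n
        p["b2"] -= lr * err.mean(axis=0)
        p["W1"] -= lr * (X.T @ dH) / n
        p["b1"] -= lr * dH.mean(axis=0)
    return p


def mse(y_true, y_pred):
    return float(np.mean((y_true - y_pred) ** 2))


coef = fit_mlr(X, y)
mse_mlr = mse(y, predict_mlr(coef, X))

params = init_ann(rng)
mse_ann_before = mse(y, forward(params, X)[1])
params = train_ann(params, X, y)
mse_ann_after = mse(y, forward(params, X)[1])

print("MLR training MSE:", mse_mlr)
print("ANN training MSE:", mse_ann_after)
```

The printed errors are computed on the training data only. Comparing predictive ability, as the abstract does, would require a held-out prediction set or cross-validation.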
