QSRR Study of GC Retention Indices of Essential-Oil Compounds by Multiple Linear Regression with a Genetic Algorithm

Quantitative structure–retention relationships (QSRR) for components of the essential oil of the plant Bidens pilosa Linn. var. Radiata were studied to enable prediction of their retention indices (IR). The data set consisted of the retention indices of 44 components of the essential oil, spanning a range of more than 635 index units. A suitable set of molecular descriptors was calculated, and the best-fitting descriptors were selected by two variable-selection methods: stepwise multiple linear regression (SW-MLR) and a genetic algorithm combined with multiple linear regression (GA-MLR). Comparison of the results indicated that the genetic algorithm was superior to the stepwise method for feature selection. The predictive quality of the QSRR models was tested on an external prediction set of nine compounds chosen at random from the 44 compounds. One GA-MLR model with five selected descriptors was obtained. This model, which had high statistical significance (R²train = 0.977, SE (%) = 2.33, F = 243.275, R²pred = 0.978), could be used to predict the retention indices of the molecules with errors below 6%.
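The GA-MLR workflow described above (descriptor subset selection, a least-squares fit, and a check on a held-out prediction set of nine compounds) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the data are synthetic, and the fitness function (leave-one-out Q²), the crossover and mutation scheme, and every name and parameter (`ga_select`, `loo_q2`, `pop_size`, `p_mut`, and so on) are assumptions made for the example.

```python
import numpy as np

def design(X, cols):
    """Design matrix: intercept column plus the chosen descriptor columns."""
    return np.column_stack([np.ones(X.shape[0]), X[:, list(cols)]])

def loo_q2(X, y, cols):
    """Leave-one-out cross-validated R^2 of an MLR model, via the hat matrix."""
    A = design(X, cols)
    H = A @ np.linalg.pinv(A)
    resid = y - H @ y
    press = np.sum((resid / (1.0 - np.diag(H))) ** 2)
    return 1.0 - press / np.sum((y - y.mean()) ** 2)

def ga_select(X, y, k=5, pop_size=30, generations=40, p_mut=0.2, seed=0):
    """Search for a k-descriptor subset with a simple elitist genetic algorithm."""
    rng = np.random.default_rng(seed)
    n_desc = X.shape[1]
    pop = [tuple(sorted(int(d) for d in rng.choice(n_desc, size=k, replace=False)))
           for _ in range(pop_size)]
    history = []  # best fitness seen in each generation
    for _ in range(generations):
        scored = sorted(pop, key=lambda c: loo_q2(X, y, c), reverse=True)
        history.append(loo_q2(X, y, scored[0]))
        parents = scored[: pop_size // 2]   # truncation selection
        children = [scored[0]]              # elitism: keep the current best
        while len(children) < pop_size:
            i, j = rng.choice(len(parents), size=2, replace=False)
            pool = sorted(set(parents[i]) | set(parents[j]))
            child = [int(d) for d in rng.choice(pool, size=k, replace=False)]
            if rng.random() < p_mut:        # mutation: swap in an unused descriptor
                unused = [d for d in range(n_desc) if d not in child]
                child[int(rng.integers(k))] = int(rng.choice(unused))
            children.append(tuple(sorted(child)))
        pop = children
    best = max(pop, key=lambda c: loo_q2(X, y, c))
    return best, history

# Synthetic stand-in data (NOT the paper's data): 44 "compounds", 20 descriptors,
# with the response driven by five of the descriptors plus noise.
rng = np.random.default_rng(42)
X = rng.normal(size=(44, 20))
y = (1200 + 80 * X[:, 0] - 50 * X[:, 3] + 30 * X[:, 7]
     + 25 * X[:, 11] - 20 * X[:, 15] + rng.normal(scale=5, size=44))

idx = rng.permutation(44)
test, train = idx[:9], idx[9:]              # nine compounds held out at random

best, history = ga_select(X[train], y[train])
coef = np.linalg.lstsq(design(X[train], best), y[train], rcond=None)[0]
pred = design(X[test], best) @ coef
r2_pred = 1 - np.sum((y[test] - pred) ** 2) / np.sum((y[test] - y[test].mean()) ** 2)
print("selected descriptors:", best)
print("LOO Q2 (train):", round(history[-1], 3), "R2 (prediction set):", round(r2_pred, 3))
```

Fixing the subset size at five mirrors the five-descriptor model reported above. A stepwise (SW-MLR) baseline could be scored with the same `loo_q2` function, which would allow a like-for-like comparison of the two selection methods.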
