Current Mathematical Methods Used in QSAR/QSPR Studies

This paper gives an overview of the mathematical methods currently used in quantitative structure-activity/property relationship (QSAR/QSPR) studies. The mathematical methods applied to the regression of QSAR/QSPR models have developed rapidly in recent years, and new methods such as Gene Expression Programming (GEP), Projection Pursuit Regression (PPR) and Local Lazy Regression (LLR) have appeared on the QSAR/QSPR stage. At the same time, earlier methods, including Multiple Linear Regression (MLR), Partial Least Squares (PLS), Neural Networks (NN) and Support Vector Machine (SVM), are being upgraded to improve their performance in QSAR/QSPR studies. These new and upgraded methods and algorithms are described in detail, and their advantages and disadvantages are evaluated and discussed to show their potential for future application in QSAR/QSPR studies.
