CORAL: QSPR model of water solubility based on local and global SMILES attributes.

Water solubility is an important characteristic of a chemical in many respects. However, experimental determination of this endpoint for all substances is impossible. In this study, quantitative structure-property relationships (QSPRs) for the negative logarithm of water solubility, -logS (mol L⁻¹), are built for five random splits of the data into a sub-training set (≈55%), a calibration set (≈25%), and a test set (≈20%). The simplified molecular input-line entry system (SMILES) is used to represent molecular structure. Optimal SMILES-based descriptors are calculated by the Monte Carlo method using the CORAL software (http://www.insilico.eu/coral). The resulting one-variable models for water solubility are characterized by the following set sizes (ranges over the five splits) and average statistical characteristics: n(sub_train)=725-763; n(calib)=312-343; n(test)=231-261; r²(sub_train)=0.9211±0.0028; r²(calib)=0.9555±0.0045; r²(test)=0.9365±0.0073; s(sub_train)=0.561±0.0086; s(calib)=0.453±0.0209; s(test)=0.520±0.0205. Thus, the statistical quality of the suggested models for water solubility is shown to be reproducible across five different splits.
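The splitting and evaluation scheme described above can be illustrated with a minimal Python sketch. It is not the CORAL implementation: the function names (`three_way_split`, `fit_line`, `r2_and_s`) are hypothetical, the descriptor values are assumed to be already computed, r² is taken as the squared Pearson correlation between predicted and observed -logS, and s is assumed to be the standard error of estimation with n-2 degrees of freedom.

```python
import math
import random


def three_way_split(items, fractions=(0.55, 0.25, 0.20), seed=0):
    """Shuffle items and cut them into sub-training, calibration and test sets."""
    rng = random.Random(seed)  # one seed corresponds to one random split
    shuffled = list(items)
    rng.shuffle(shuffled)
    n_sub = round(fractions[0] * len(shuffled))
    n_cal = round(fractions[1] * len(shuffled))
    return (shuffled[:n_sub],
            shuffled[n_sub:n_sub + n_cal],
            shuffled[n_sub + n_cal:])


def fit_line(x, y):
    """Ordinary least squares for the one-variable model y = c0 + c1 * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    c1 = sxy / sxx
    return my - c1 * mx, c1


def r2_and_s(x, y, c0, c1):
    """Return (r2, s) for a fitted line on one set of compounds.

    Assumptions: r2 is the squared Pearson correlation between predicted
    and observed values; s = sqrt(SSE / (n - 2)).
    """
    n = len(x)
    pred = [c0 + c1 * a for a in x]
    mp, my = sum(pred) / n, sum(y) / n
    cov = sum((p - mp) * (b - my) for p, b in zip(pred, y))
    var_p = sum((p - mp) ** 2 for p in pred)
    var_y = sum((b - my) ** 2 for b in y)
    r2 = cov ** 2 / (var_p * var_y)
    sse = sum((b - p) ** 2 for p, b in zip(pred, y))
    return r2, math.sqrt(sse / (n - 2))
```

In this sketch, the line would be fitted on the sub-training set only and then evaluated with `r2_and_s` on each of the three sets; repeating the procedure with five different seeds would give averages and spreads of the kind reported above.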
