Total and Local Quadratic Indices of the Molecular Pseudograph’s Atom Adjacency Matrix: Applications to the Prediction of Physical Properties of Organic Compounds

A novel topological approach for obtaining a family of new molecular descriptors is proposed. A vector space E (the molecular vector space), whose elements are organic molecules, is defined as a “direct sum” of different ℜ^i spaces. In this way, molecules having a total of i atoms can be represented as elements (vectors) of the vector spaces ℜ^i (i=1, 2, 3,..., n, where n is the number of atoms in the molecule). In these spaces the components of the vectors are atomic properties that characterize each kind of atom. The total quadratic indices are based on the calculation of mathematical quadratic forms. These forms are functions of the k-th power of the molecular pseudograph’s atom adjacency matrix (M). For simplicity, canonical bases are selected as the bases of the quadratic forms. These indices were generalized to “higher analogues” as number sequences. In addition, this paper introduces a local approach (local invariant) for molecular quadratic indices. This approach is based mainly on the use of a local matrix, M^k(G, FR), which is obtained from the k-th power, M^k(G), of the atom adjacency matrix M. M^k(G, FR) includes the elements of the fragment of interest and those that are connected with it through paths of length k. Finally, total (and local) quadratic indices have been used in QSPR studies of four series of organic compounds. The quantitative models found are statistically significant and permit a clear interpretation of the studied properties in terms of the structural features of the molecules. Model predictability was assessed with external prediction series and cross-validation procedures (leave-one-out and leave-group-out). The reported method gives results similar to those of other topological approaches.
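The quadratic forms described above can be illustrated with a minimal sketch. It assumes the total index of order k is q_k(x) = x^T M^k x, where x holds one atomic property per atom and M is the atom adjacency matrix of the pseudograph (so loops and multiple edges may appear as diagonal entries or entries greater than 1). The function names, the unit property vector and the three-atom chain are hypothetical. The local weighting (1 when both atoms belong to the fragment, 1/2 when exactly one does, 0 otherwise) is likewise an assumption chosen for illustration, since the abstract does not state how the entries of M^k(G, FR) are scaled.

```python
import numpy as np

def total_quadratic_index(M, x, k):
    """Quadratic form x^T M^k x in the canonical basis (assumed definition)."""
    Mk = np.linalg.matrix_power(np.asarray(M), k)
    x = np.asarray(x, dtype=float)
    return float(x @ Mk @ x)

def local_quadratic_index(M, x, k, fragment):
    """Quadratic form restricted to a fragment FR (atom indices).

    ASSUMPTION: entry (i, j) of M^k is kept in full when both atoms are
    in the fragment, halved when exactly one is, and dropped otherwise.
    """
    Mk = np.linalg.matrix_power(np.asarray(M), k).astype(float)
    in_frag = np.zeros(Mk.shape[0], dtype=float)
    in_frag[list(fragment)] = 1.0
    weights = (in_frag[:, None] + in_frag[None, :]) / 2.0
    x = np.asarray(x, dtype=float)
    return float(x @ (weights * Mk) @ x)

# Hypothetical example: a three-atom chain with a unit property on each atom.
M = [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]
x = [1.0, 1.0, 1.0]

totals = [total_quadratic_index(M, x, k) for k in range(3)]
locals_k1 = [local_quadratic_index(M, x, 1, [atom]) for atom in range(3)]
```

With this particular weighting, the local indices of a set of fragments that partition the molecule add up to the total index of the same order, which is one reason to choose it in a sketch; other scalings are possible.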
The results were as follows: a) seven physical properties of 74 normal and branched alkanes (boiling points, molar volumes, molar refractions, heats of vaporization, critical temperatures, critical pressures and surface tensions) were well modeled (R>0.98, q²>0.95) by the total quadratic indices; the overall MAEs of 5-fold cross-validation were 2.11 °C, 0.53 cm³, 0.032 cm³, 0.32 kJ/mol, 5.34 °C, 0.64 atm and 0.23 dyn/cm, respectively; b) the boiling points of 58 alkyl alcohols were also well described by the present approach, with two QSPR models: the first was developed using the complete set of 58 alcohols [R=0.9938, q²=0.986, s=4.006 °C, overall MAE of 5-fold cross-validation=3.824 °C], and the second using 29 compounds as a training set [R=0.9979, q²=0.992, s=2.97 °C, overall MAE of 5-fold cross-validation=2.580 °C] and 29 compounds as a test set [R=0.9938, s=3.17 °C]; c) good relationships were obtained for the boiling points of cycloalkanes (80 in the training set and 26 in the test set) using 2 and 5 total quadratic indices [training set: R=0.9823 (q²=0.961, overall MAE of 5-fold cross-validation=6.429 °C) and R=0.9927 (q²=0.977, overall MAE of 5-fold cross-validation=4.801 °C); test set: R=0.9726 and R=0.9927]; and d) the linear model developed to describe the boiling points of 70 organic compounds containing aromatic rings showed good statistical features, with a squared correlation coefficient (R²) of 0.981 (s=7.61 °C). Internal validation procedures (q²=0.9763 and overall MAE of 5-fold cross-validation=7.34 °C) were used to assess the predictability and robustness of the model. The predictive performance of this QSPR model was also tested on an extra set of 20 aromatic organic compounds (R=0.9930 and s=7.8280 °C).
These results indicate that the new indices fulfill some of the ideal requirements proposed by Randić for a new molecular descriptor.
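The validation statistics quoted above (leave-one-out q² and the overall MAE of 5-fold cross-validation) can be sketched for an ordinary least-squares model as follows. This is an illustration under stated assumptions, not the procedure used in the paper: q² is taken as 1 − PRESS/SS_tot with SS_tot computed around the mean of the full data set, the folds are a random partition, and the function names and the synthetic data are hypothetical.

```python
import numpy as np

def fit_predict(X_train, y_train, X_test):
    """Fit a linear model with intercept by least squares, then predict."""
    A = np.column_stack([np.ones(len(X_train)), X_train])
    coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    return np.column_stack([np.ones(len(X_test)), X_test]) @ coef

def loo_q2(X, y):
    """Leave-one-out q2 = 1 - PRESS / SS_tot (assumed definition)."""
    n = len(y)
    press = 0.0
    for i in range(n):
        keep = np.arange(n) != i
        pred = fit_predict(X[keep], y[keep], X[~keep])[0]
        press += (y[i] - pred) ** 2
    return float(1.0 - press / np.sum((y - y.mean()) ** 2))

def kfold_mae(X, y, n_folds=5, seed=0):
    """Mean absolute error pooled over all held-out folds."""
    order = np.random.default_rng(seed).permutation(len(y))
    abs_errors = []
    for fold in np.array_split(order, n_folds):
        keep = np.ones(len(y), dtype=bool)
        keep[fold] = False
        pred = fit_predict(X[keep], y[keep], X[fold])
        abs_errors.extend(np.abs(y[fold] - pred))
    return float(np.mean(abs_errors))

# Synthetic, noise-free data (hypothetical): one descriptor, exact linear law.
X_demo = np.arange(10.0).reshape(-1, 1)
y_demo = 2.0 * X_demo[:, 0] + 1.0

q2_demo = loo_q2(X_demo, y_demo)
mae_demo = kfold_mae(X_demo, y_demo)
```

On noise-free linear data the held-out predictions are exact up to rounding, so q² is essentially 1 and the MAE essentially 0; on real property data both statistics would be read alongside the external test-set results.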
