The Signature Molecular Descriptor. 1. Using Extended Valence Sequences in QSAR and QSPR Studies

We present a new descriptor, named signature, based on extended valence sequences. The signature of an atom is a canonical representation of the atom's environment up to a predefined height h. The signature of a molecule is a vector of occurrence numbers of atomic signatures. QSAR and QSPR models based on signature are compared, on two data sets, with models obtained using popular molecular 2D descriptors taken from commercially available software (Molconn-Z). The first set contains the inhibition concentration at 50% (IC50) for 121 HIV-1 protease inhibitors, while the second contains 12865 octanol/water partition coefficients (log P). For both data sets, the models built with signature performed comparably to those built with the commercially available descriptors, both in correlating the data and in predicting test set values not used in the parametrization. While probing signature's QSAR and QSPR performance, we demonstrate that for any given molecule of diameter D, there is a molecular signature of height h ≤ D+1 from which any 2D descriptor can be computed. As a consequence of this finding, any QSAR or QSPR involving 2D descriptors can be replaced with a relationship involving occurrence numbers of atomic signatures.
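
To make the two definitions concrete, the following is a minimal sketch, not the authors' implementation. It assumes a molecule is given as a list of element labels plus an adjacency map, and it approximates the atomic signature by a sorted, parenthesized string of the tree of neighbors unfolded up to height h without stepping straight back to the parent atom. The function names `atomic_signature` and `molecular_signature`, the string format, and the hydrogen-suppressed ethanol example are all illustrative assumptions.

```python
from collections import Counter


def atomic_signature(atoms, adjacency, root, height, parent=None):
    """Return a canonical string for the environment of `root` up to `height` bonds.

    Assumption: the environment is unfolded as a tree that never steps
    directly back to the atom it came from; branches are sorted so the
    string does not depend on atom numbering.
    """
    label = atoms[root]
    if height == 0:
        return label
    branches = sorted(
        atomic_signature(atoms, adjacency, nbr, height - 1, parent=root)
        for nbr in adjacency[root]
        if nbr != parent
    )
    return label + "".join("(" + b + ")" for b in branches)


def molecular_signature(atoms, adjacency, height):
    """Count how often each atomic signature of the given height occurs."""
    return Counter(
        atomic_signature(atoms, adjacency, i, height) for i in range(len(atoms))
    )


# Hypothetical input: heavy-atom skeleton of ethanol, C-C-O.
atoms = ["C", "C", "O"]
adjacency = {0: [1], 1: [0, 2], 2: [1]}

sig_h0 = molecular_signature(atoms, adjacency, 0)
sig_h1 = molecular_signature(atoms, adjacency, 1)
print(dict(sig_h0))
print(dict(sig_h1))
```

At height 0 the molecular signature reduces to element counts; increasing the height distinguishes atoms by progressively larger neighborhoods, and the resulting occurrence numbers are the kind of descriptor vector a regression model would take as input.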
